Abstract

A multi-hop relaying system is analyzed where data sent by a multi-antenna source is relayed by
successive multi-antenna relays until it reaches a multi-antenna destination. Assuming correlated fading
at each hop, each relay receives a faded version of the signal from the previous level, performs linear
precoding and retransmits it to the next level. Using free probability theory and assuming that the
noise power at relaying levels, but not at the destination, is negligible, the closed-form expression of the
asymptotic instantaneous end-to-end mutual information is derived as the number of antennas at all levels
grows large. The so-obtained deterministic expression is independent of the channel realizations and
depends only on channel statistics. Moreover, it also serves as the asymptotic value of the average
end-to-end mutual information. The optimal singular vectors of the precoding matrices that maximize
the average mutual information with a finite number of antennas at all levels are also provided. It turns
out that the optimal precoding singular vectors are aligned to the eigenvectors of the channel correlation
matrices. Thus they can be determined using only the known channel statistics. As the optimal precoding
singular vectors are independent of the system size, they are also optimal in the asymptotic regime.

Fig. 1. Multi-level Relaying System

I. Introduction 

Relay communication systems have recently attracted much attention due to their potential to substantially
improve the signal reception quality when the direct communication link between the source and
the destination is not reliable. Due to its major practical importance as well as its significant technical 
challenge, deriving the capacity - or bounds on the capacity - of various relay communication schemes 
is growing to an entire field of research. Of particular interest is the derivation of capacity bounds for 
systems in which the source, the destination, and the relays are equipped with multiple antennas. 

Several works have focused on the capacity of two-hop relay networks, such as [1]-[7]. Assuming fixed
channel conditions, lower and upper bounds on the capacity of the two-hop multiple-input multiple-output
(MIMO) relay channel were derived in [1]. In the same paper, bounds on the ergodic capacity were also
obtained when the communication links undergo i.i.d. Rayleigh fading. The capacity of a MIMO two-hop
relay system was studied in [2] in the asymptotic case where the number of relay nodes grows large
while the numbers of transmit and receive antennas remain constant. The scaling behavior of the capacity
in two-hop amplify-and-forward (AF) networks was analyzed in [3]-[5] when the numbers of single- 
antenna sources, relays and destinations grow large. The achievable rates of a two-hop code-division 
multiple-access (CDMA) decode-and-forward (DF) relay system were derived in [8] when the numbers 
of transmit antennas and relays grow large. In [6], an ad hoc network with several source-destination pairs 
communicating through multiple AF relays was studied and an upper bound on the asymptotic capacity
in the low Signal-to-Noise Ratio (SNR) regime was obtained in the case where the numbers of source, 
relay and destination nodes grow large. The scaling behavior of the capacity of a two-hop MIMO relay 
channel was also studied in [7] for bi-directional transmissions. In [9] the optimal relay precoding matrix 
was derived for a two-hop relay system with perfect knowledge of the source-relay and relay-destination
channel matrices at the relay.

Following the work in [10] on the asymptotic eigenvalue distribution of concatenated fading channels, 
several analyses were proposed for more general multi-hop relay networks, including [11]-[15]. In
particular, considering multi-hop MIMO AF networks, the tradeoffs between rate, diversity, and network 
size were analyzed in [11], and the diversity-multiplexing tradeoff was derived in [12]. The asymptotic 
capacity of multi-hop MIMO AF relay systems was obtained in [13] when all channel links experience 
i.i.d. Rayleigh fading while the number of transmit and receive antennas, as well as the number of relays 
at each hop grow large with the same rate. Finally hierarchical multi-hop MIMO networks were studied 
in [15] and the scaling laws of capacity were derived when the network density increases. 

In this paper, we study an $N$-hop MIMO relay communication system wherein data transmission from
$k_0$ source antennas to $k_N$ destination antennas is made possible through $N-1$ relay levels, each of
which is equipped with $k_i$, $i = 1, \ldots, N-1$, antennas. In this transmission chain with $N+1$ levels
it is assumed that the direct communication link is only viable between two adjacent levels: each relay 
receives a faded version of the multi-dimensional signal transmitted from the previous level and, after 
linear precoding, retransmits it to the next level. We consider the case where all communication links 
undergo Rayleigh flat fading and the fading channels at each hop (between two adjacent levels) may 
be correlated while the fading channels of any two different hops are independent. We assume that the 
channel at each hop is block-fading and that the channel coherence-time is long enough — with respect 
to codeword length — for the system to be in the non-ergodic regime. As a consequence, the channel is 
a realization of a random matrix that is fixed during a coherence block, and the instantaneous end-to-end 
mutual information between the source and the destination is a random quantity. 

Using tools from the free probability theory and assuming that the noise power at the relay levels, but 
not at the destination, is negligible, we derive a closed-form expression of the asymptotic instantaneous 
end-to-end mutual information between the source input and the destination output as the number of 
antennas at all levels grows large. This asymptotic expression is shown to be independent from the channel 
realizations and to only depend on the channel statistics. Therefore, as long as the statistical properties 
of the channel matrices at all hops do not change, the instantaneous mutual information asymptotically 
converges to the same deterministic expression for any arbitrary channel realization. This property has 
two major consequences. First, the mutual information in the asymptotic regime is not a random variable 
any more but a deterministic value representing an achievable rate. This means that when the channel is
random but fixed during the transmission and the system size is large enough, the capacity in the sense
of Shannon is not zero, contrary to the capacity of small-size systems [16, Section 5.1]. Second,
given the stationarity of channel statistical properties, the asymptotic instantaneous mutual information
obtained in the non-ergodic regime also serves as the asymptotic value of the average end-to-end mutual 
information between the source and the destination. Note that the latter is the same as the asymptotic 
ergodic end-to-end mutual information that would be obtained if the channel was an ergodic process. 

We also obtain the singular vectors of the optimal precoding matrices that maximize the average mutual 
information of the system with a finite number of antennas at all levels. It is proven that the singular 
vectors of the optimal precoding matrices are also independent from the channel realizations and can 
be determined only using statistical knowledge of channel matrices at source and relays. We show that 
the so-obtained singular vectors are also optimal in the asymptotic regime of our concern. The derived 
asymptotic mutual information expression and optimal precoding singular vectors set the stage for our 
future work on obtaining the optimal power allocation, or, equivalently, finding the optimal precoding 
singular values. Finally, we apply the aforementioned results on the asymptotic mutual information and 
the structure of the optimal precoding matrices to several communications scenarios with different number 
of hops, and types of channel correlation. 

The rest of the paper is organized as follows. Notations and the system model are presented in Section II.
The end-to-end instantaneous mutual information in the asymptotic regime is derived in Section III, while
the optimal singular vectors of the precoding matrices are obtained in Section IV. Theorems derived in
Sections III and IV are applied to several MIMO communication scenarios in Section V. Numerical
results are provided in Section VI and concluding remarks are drawn in Section VII.

II. System Model 

Notation: $\log$ denotes the logarithm in base 2 while $\ln$ is the logarithm in base $e$. $u(x)$ is the
unit-step function defined by $u(x) = 0$ if $x < 0$; $u(x) = 1$ if $x \ge 0$.
$K(m) \triangleq \int_0^{\pi/2} \frac{d\theta}{\sqrt{1 - m \sin^2\theta}}$ is the
complete elliptic integral of the first kind [17]. Matrices and vectors are represented by boldface upper
and lower cases, respectively. $\mathbf{A}^T$, $\mathbf{A}^*$, $\mathbf{A}^H$ stand for the transpose, the conjugate and the transpose
conjugate of $\mathbf{A}$, respectively. The trace and the determinant of $\mathbf{A}$ are respectively denoted by $\mathrm{tr}(\mathbf{A})$ and
$\det(\mathbf{A})$. $\lambda_{\mathbf{A}}(1), \ldots, \lambda_{\mathbf{A}}(n)$ represent the eigenvalues of an $n \times n$ matrix $\mathbf{A}$. The operator norm of $\mathbf{A}$
is defined by $\|\mathbf{A}\| \triangleq \sqrt{\max_i \lambda_{\mathbf{A}^H\mathbf{A}}(i)}$, while the Frobenius norm of $\mathbf{A}$ is $\|\mathbf{A}\|_F \triangleq \sqrt{\mathrm{tr}(\mathbf{A}^H\mathbf{A})}$. The
$(i,j)$-th entry of matrix $\mathbf{A}_k$ is written $a^{(k)}_{ij}$. $\mathbf{I}_N$ is the identity matrix of size $N$. $E[\cdot]$ is the statistical
expectation operator, $\mathcal{H}(X)$ the entropy of a variable $X$, and $\mathcal{I}(X; Y)$ the mutual information between
variables $X$ and $Y$. $F^n_{\mathbf{\Omega}}(\cdot)$ is the empirical eigenvalue distribution of an $n \times n$ square matrix $\mathbf{\Omega}$ with
real eigenvalues, while $F_{\mathbf{\Omega}}(\cdot)$ and $f_{\mathbf{\Omega}}(\cdot)$ are respectively its asymptotic eigenvalue distribution and its
eigenvalue probability density function when its size $n$ grows large. We denote the matrix product by
$\bigotimes_{i=1}^{N} \mathbf{A}_i = \mathbf{A}_1 \mathbf{A}_2 \cdots \mathbf{A}_N$. Note that the matrix product is not commutative, therefore the order of the
index $i$ in the product is important and in particular $\left(\bigotimes_{i=1}^{N} \mathbf{A}_i\right)^H = \bigotimes_{i=N}^{1} \mathbf{A}_i^H$.

A. Multi-hop MIMO relay network 

Consider Fig. 1, which shows a multi-hop relaying system with $k_0$ source antennas, $k_N$ destination
antennas and $N-1$ relaying levels. The $i$-th relaying level is equipped with $k_i$ antennas. We assume
that the noise power is negligible at all relays while at the destination the noise power is such that

$$E[\mathbf{z}\mathbf{z}^H] = \sigma^2 \mathbf{I} = \frac{1}{\eta}\mathbf{I} \tag{1}$$

where $\mathbf{z}$ is the circularly-symmetric zero-mean i.i.d. Gaussian noise vector at the destination. The simplifying
noise-free relay assumption is a first step towards the future information-theoretic study of the more
complex noisy relay scenario. Note that several other authors have implicitly used a similar noise-free
relay assumption. For instance, in [12] a multi-hop AF relay network is analyzed and it is proved that
the resulting colored noise at the destination can be well-approximated by white noise in the high SNR
regime. In a multi-hop MIMO relay system, it can be shown that the white-noise assumption would be
equivalent to assuming negligible noise at relays, but non-negligible noise at the destination.

Throughout the paper, we assume that the correlated channel matrix at hop $i \in \{1, \ldots, N\}$ can be
represented by the Kronecker model

$$\mathbf{H}_i \triangleq \mathbf{C}_{r,i}^{1/2}\, \mathbf{\Theta}_i\, \mathbf{C}_{t,i}^{1/2} \tag{2}$$

where $\mathbf{C}_{t,i}$, $\mathbf{C}_{r,i}$ are respectively the transmit and receive correlation matrices, $\mathbf{\Theta}_i$ are zero-mean i.i.d.
Gaussian matrices independent over index $i$, with variance of the $(k,l)$-th entry

$$E\left[|\theta^{(i)}_{kl}|^2\right] = \frac{a_i}{k_{i-1}}, \qquad i = 1, \ldots, N \tag{3}$$

where $a_i = d_i^{-\beta}$ represents the pathloss attenuation with $\beta$ and $d_i$ denoting the pathloss exponent and
the length of the $i$-th hop respectively. We also assume that channel matrices $\mathbf{H}_i$, $i = 1, \ldots, N$ remain
constant during a coherence block of length $L$ and vary independently from one channel coherence block
to the next.

Note that the i.i.d. Rayleigh fading channel is obtained from the above Kronecker model when matrices
$\mathbf{C}_{t,i}$ and $\mathbf{C}_{r,i}$ are set to identity.
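As a rough illustration of the Kronecker model, the following Python sketch draws one channel realization for a hop with two antennas on each side. The exponential correlation matrix [[1, r], [r, 1]], the pathloss law $a_i = d_i^{-\beta}$, the numerical values and all function names are assumptions made for this example only; the model itself does not prescribe them.

```python
import math
import random

def sqrt_exp_corr_2x2(r):
    """Closed-form square root of C = [[1, r], [r, 1]] for 0 <= r < 1 (assumed correlation)."""
    p, m = math.sqrt(1.0 + r), math.sqrt(1.0 - r)
    a, b = (p + m) / 2.0, (p - m) / 2.0
    return [[a, b], [b, a]]

def matmul(A, B):
    """Plain-Python matrix product (real or complex entries)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def sample_theta(rows, cols, a_i, rng):
    """i.i.d. circularly-symmetric complex Gaussian entries with variance a_i / cols,
    where cols plays the role of k_{i-1} (number of transmitting antennas)."""
    s = math.sqrt(a_i / cols / 2.0)  # standard deviation of real and imaginary parts
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(cols)]
            for _ in range(rows)]

def sample_channel(r_rx, r_tx, d_i, beta, rng):
    """One realization of H = C_r^{1/2} Theta C_t^{1/2} for a 2 x 2 hop."""
    a_i = d_i ** (-beta)  # assumed pathloss law
    theta = sample_theta(2, 2, a_i, rng)
    return matmul(matmul(sqrt_exp_corr_2x2(r_rx), theta), sqrt_exp_corr_2x2(r_tx))

rng = random.Random(0)  # fixed seed so the sketch is reproducible
H = sample_channel(r_rx=0.3, r_tx=0.7, d_i=2.0, beta=3.0, rng=rng)
```

Setting `r_rx = r_tx = 0` makes both square roots equal to the identity, so the sketch falls back to the i.i.d. Rayleigh case mentioned above.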

Within one channel coherence block, the signal transmitted by the $k_0$ source antennas at time $l \in
\{0, \ldots, L-1\}$ is given by the vector $\mathbf{x}_0(l) = \mathbf{P}_0 \mathbf{y}_0(l-1)$, where $\mathbf{P}_0$ is the source precoding matrix and
$\mathbf{y}_0$ is a zero-mean random vector with

$$E\{\mathbf{y}_0\mathbf{y}_0^H\} = \mathbf{I}_{k_0} \tag{4}$$

which implies that

$$E\{\mathbf{x}_0\mathbf{x}_0^H\} = \mathbf{P}_0\mathbf{P}_0^H. \tag{5}$$

Assuming that relays work in full-duplex mode, at time $l \in \{0, \ldots, L-1\}$ the relay at level $i$ uses
a precoding matrix $\mathbf{P}_i$ to linearly precode its received signal $\mathbf{y}_i(l-1) = \mathbf{H}_i\mathbf{x}_{i-1}(l-1)$ and form its
transmitted signal

$$\mathbf{x}_i(l) = \mathbf{P}_i \mathbf{y}_i(l-1), \qquad i = 0, \ldots, N-1. \tag{6}$$

The precoding matrices at source and relays $\mathbf{P}_i$, $i = 0, \ldots, N-1$ are subject to the per-node long-term
average power constraints

$$\mathrm{tr}(E[\mathbf{x}_i\mathbf{x}_i^H]) \le k_i \mathcal{P}_i, \qquad i = 0, \ldots, N-1. \tag{7}$$

The fact that $\mathbf{y}_i = \mathbf{H}_i\mathbf{x}_{i-1}$, along with the variance $E[|\theta^{(i)}_{kl}|^2] = \frac{a_i}{k_{i-1}}$ of $\mathbf{H}_i$ elements and with the power
constraint $\mathrm{tr}(E[\mathbf{x}_{i-1}\mathbf{x}_{i-1}^H]) \le k_{i-1}\mathcal{P}_{i-1}$ on $\mathbf{x}_{i-1}$, renders the system of our concern equivalent to a system
whose random channel elements $\theta^{(i)}_{kl}$ would be i.i.d. with variance $a_i$ and whose power constraint on
transmitted signal $\mathbf{x}_{i-1}$ would be finite and equal to $\mathcal{P}_{i-1}$. Having finite transmit power at each level,
this equivalent system shows that adding antennas, i.e. increasing the system dimension, does not imply
increasing the transmit power. Nonetheless, in order to use random matrix theory tools to derive the
asymptotic instantaneous mutual information in Section III, the variance of random channel elements is
required to be normalized by the size of the channel matrix. That is why the normalized model, with channel
variance (3) and power constraint (7), was adopted.

It should also be noticed that choosing diagonal precoding matrices would reduce the above scheme
to the simpler AF relaying strategy.

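As can be observed from Fig. 1, the signal received at the destination at time $l$ is given by

$$\begin{aligned}
\mathbf{y}_N(l) &= \mathbf{H}_N\mathbf{P}_{N-1}\mathbf{H}_{N-1}\mathbf{P}_{N-2} \cdots \mathbf{H}_2\mathbf{P}_1\mathbf{H}_1\mathbf{P}_0\, \mathbf{y}_0(l-N) + \mathbf{z} \\
&= \mathbf{G}_N\, \mathbf{y}_0(l-N) + \mathbf{z}
\end{aligned} \tag{8}$$

where the end-to-end equivalent channel is

$$\begin{aligned}
\mathbf{G}_N &\triangleq \mathbf{H}_N\mathbf{P}_{N-1}\mathbf{H}_{N-1}\mathbf{P}_{N-2} \cdots \mathbf{H}_2\mathbf{P}_1\mathbf{H}_1\mathbf{P}_0 \\
&= \mathbf{C}_{r,N}^{1/2}\mathbf{\Theta}_N\mathbf{C}_{t,N}^{1/2}\mathbf{P}_{N-1}\mathbf{C}_{r,N-1}^{1/2}\mathbf{\Theta}_{N-1}\mathbf{C}_{t,N-1}^{1/2}\mathbf{P}_{N-2} \cdots \mathbf{C}_{r,2}^{1/2}\mathbf{\Theta}_2\mathbf{C}_{t,2}^{1/2}\mathbf{P}_1\mathbf{C}_{r,1}^{1/2}\mathbf{\Theta}_1\mathbf{C}_{t,1}^{1/2}\mathbf{P}_0.
\end{aligned} \tag{9}$$

Let us introduce the matrices

$$\begin{aligned}
\mathbf{M}_0 &= \mathbf{C}_{t,1}^{1/2}\mathbf{P}_0 \\
\mathbf{M}_i &= \mathbf{C}_{t,i+1}^{1/2}\mathbf{P}_i\mathbf{C}_{r,i}^{1/2}, \qquad i = 1, \ldots, N-1 \\
\mathbf{M}_N &= \mathbf{C}_{r,N}^{1/2}.
\end{aligned} \tag{10}$$

Then (9) can be rewritten as

$$\mathbf{G}_N = \mathbf{M}_N\mathbf{\Theta}_N\mathbf{M}_{N-1}\mathbf{\Theta}_{N-1} \cdots \mathbf{M}_2\mathbf{\Theta}_2\mathbf{M}_1\mathbf{\Theta}_1\mathbf{M}_0. \tag{11}$$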

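For the sake of clarity, the dimensions of the matrices/vectors involved in our analysis are given below.

$\mathbf{x}_i$: $k_i \times 1$      $\mathbf{y}_i$: $k_i \times 1$      $\mathbf{P}_i$: $k_i \times k_i$
$\mathbf{H}_i$: $k_i \times k_{i-1}$      $\mathbf{C}_{r,i}$: $k_i \times k_i$      $\mathbf{C}_{t,i}$: $k_{i-1} \times k_{i-1}$
$\mathbf{\Theta}_i$: $k_i \times k_{i-1}$      $\mathbf{M}_i$: $k_i \times k_i$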

In the sequel, we assume that the channel coherence time is large enough to consider the non-ergodic
case and consequently, time index $l$ can be dropped. Finally, we define three channel-knowledge
assumptions:

• Assumption $A_s$, local statistical knowledge at source: the source has only statistical channel state
information (CSI) of its forward channel $\mathbf{H}_1$, i.e. the source knows the transmit correlation matrix
$\mathbf{C}_{t,1}$.

• Assumption $A_r$, local statistical knowledge at relay: at the $i$-th relaying level, $i \in \{1, \ldots, N-1\}$,
only statistical CSI of the backward channel $\mathbf{H}_i$ and forward channel $\mathbf{H}_{i+1}$ are available, i.e. relay
$i$ knows the receive correlation matrix $\mathbf{C}_{r,i}$ and the transmit correlation matrix $\mathbf{C}_{t,i+1}$.

• Assumption $A_d$, end-to-end perfect knowledge at destination: the destination perfectly knows the
end-to-end equivalent channel $\mathbf{G}_N$.

Throughout the paper, assumption $A_d$ is always made. Assumption $A_d$ is the single assumption on
channel-knowledge necessary to derive the asymptotic mutual information in Section III, while the two
extra assumptions $A_s$ and $A_r$ are also necessary in Section IV to obtain the singular vectors of the
optimal precoding matrices.

B. Mutual Information 

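Consider the channel realization $G_N$ in one channel coherence block. Under Assumption $A_d$, the
instantaneous end-to-end mutual information between channel input $\mathbf{y}_0$ and channel output $(\mathbf{y}_N, \mathbf{G}_N)$ in
this channel coherence block is [16]

$$\begin{aligned}
\mathcal{I}(\mathbf{y}_0; \mathbf{y}_N \mid \mathbf{G}_N = G_N) &= \mathcal{H}(\mathbf{y}_N \mid \mathbf{G}_N = G_N) - \underbrace{\mathcal{H}(\mathbf{y}_N \mid \mathbf{y}_0, \mathbf{G}_N = G_N)}_{\mathcal{H}(\mathbf{z})} \\
&= \mathcal{H}(\mathbf{y}_N \mid \mathbf{G}_N = G_N) - \mathcal{H}(\mathbf{z}).
\end{aligned} \tag{12}$$

The entropy of the noise vector is known to be $\mathcal{H}(\mathbf{z}) = \log\det\left(\frac{\pi e}{\eta}\mathbf{I}_{k_N}\right)$. Besides, $\mathbf{y}_0$ is zero-mean with
variance $E[\mathbf{y}_0\mathbf{y}_0^H] = \mathbf{I}_{k_0}$, thus given $G_N$, the received signal $\mathbf{y}_N$ is zero-mean with variance $G_N G_N^H + \frac{1}{\eta}\mathbf{I}_{k_N}$.
By [16, Lemma 2], we have the inequality $\mathcal{H}(\mathbf{y}_N \mid \mathbf{G}_N = G_N) \le \log\det\left(\pi e\, G_N G_N^H + \frac{\pi e}{\eta}\mathbf{I}_{k_N}\right)$, and
the entropy is maximized when the latter inequality holds with equality. This occurs if $\mathbf{y}_N$ is circularly-symmetric
complex Gaussian, which is the case when $\mathbf{y}_0$ is circularly-symmetric complex Gaussian.
Therefore throughout the rest of the paper we consider $\mathbf{y}_0$ to be a zero-mean circularly-symmetric complex
Gaussian vector. As such, the instantaneous mutual information (12) can be rewritten as

$$\mathcal{I}(\mathbf{y}_0; \mathbf{y}_N \mid \mathbf{G}_N = G_N) = \log\det\left(\mathbf{I}_{k_N} + \eta\, G_N G_N^H\right). \tag{13}$$

Under Assumption $A_d$, the average end-to-end mutual information between channel input $\mathbf{y}_0$ and
channel output $(\mathbf{y}_N, \mathbf{G}_N)$ is

$$\begin{aligned}
\mathcal{I}(\mathbf{y}_0; (\mathbf{y}_N, \mathbf{G}_N)) &= \mathcal{I}(\mathbf{y}_0; \mathbf{y}_N \mid \mathbf{G}_N) + \underbrace{\mathcal{I}(\mathbf{y}_0; \mathbf{G}_N)}_{0} \\
&= \mathcal{I}(\mathbf{y}_0; \mathbf{y}_N \mid \mathbf{G}_N) \\
&= E_{\mathbf{G}_N}\left[\mathcal{I}(\mathbf{y}_0; \mathbf{y}_N \mid \mathbf{G}_N = G_N)\right] \\
&= E_{\mathbf{G}_N}\left[\log\det\left(\mathbf{I}_{k_N} + \eta\, \mathbf{G}_N\mathbf{G}_N^H\right)\right].
\end{aligned} \tag{14}$$

To optimize the system, we are left with finding the precoders $\mathbf{P}_i$ that maximize the end-to-end mutual
information (14) subject to power constraints (7). In other words, we need to find the maximum average
end-to-end mutual information

$$C \triangleq \max_{\{\mathbf{P}_i \,:\, \mathrm{tr}(E[\mathbf{x}_i\mathbf{x}_i^H]) \le k_i\mathcal{P}_i\}_{i \in \{0, \ldots, N-1\}}} E_{\mathbf{G}_N}\left[\log\det\left(\mathbf{I}_{k_N} + \eta\, \mathbf{G}_N\mathbf{G}_N^H\right)\right]. \tag{15}$$

Note that the non-ergodic regime is considered, therefore (14) represents only an average mutual
information over channel realizations, and the solution to (15) does not necessarily represent the channel
capacity in the Shannon sense when the system size is small.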
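For a given realization of the end-to-end channel, the log-determinant expression of the instantaneous mutual information can be evaluated directly. The sketch below does so in plain Python for a small matrix; the helper names (`det`, `instantaneous_mi`) and the sample matrices are hypothetical and only meant to make the formula concrete.

```python
import math

def matmul(A, B):
    """Plain-Python matrix product."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(A):
    """Hermitian transpose of a (possibly complex) matrix."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def det(A):
    """Determinant by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [[complex(x) for x in row] for row in A]
    d = 1 + 0j
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        if abs(M[p][c]) == 0.0:
            return 0j
        if p != c:
            M[c], M[p] = M[p], M[c]
            d = -d  # a row swap flips the sign of the determinant
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def instantaneous_mi(G, eta):
    """log2 det(I + eta * G G^H), in bits per channel use, for a fixed realization G."""
    GGh = matmul(G, conj_transpose(G))
    n = len(GGh)
    A = [[(1.0 if i == j else 0.0) + eta * GGh[i][j] for j in range(n)]
         for i in range(n)]
    return math.log2(det(A).real)  # I + eta G G^H is Hermitian positive definite

mi_identity = instantaneous_mi([[1, 0], [0, 1]], eta=3.0)
```

With the identity as channel and $\eta = 3$, the determinant is $(1+3)^2 = 16$, so the call above evaluates to 4 bits; averaging `instantaneous_mi` over many realizations would estimate the average mutual information.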
III. Asymptotic Mutual Information

In this section, we consider the instantaneous mutual information per source antenna between the
source and the destination

$$I \triangleq \frac{1}{k_0}\log\det\left(\mathbf{I}_{k_N} + \eta\, \mathbf{G}_N\mathbf{G}_N^H\right) \tag{16}$$

and derive its asymptotic value as the numbers of antennas $k_0, k_1, \ldots, k_N$ grow large. The following
theorem holds.

Theorem 1: For the system described in Section II, assume that

• channel knowledge assumption $A_d$ holds;

• $k_0, k_1, \ldots, k_N \to \infty$ while $\frac{k_i}{k_N} \to \rho_i$ for $i = 0, \ldots, N$;

• for $i = 0, \ldots, N$, as $k_i \to \infty$, $\mathbf{M}_i^H\mathbf{M}_i$ has a limit eigenvalue distribution with a compact support.

Then the instantaneous mutual information per source antenna $I$ converges almost surely to

$$I_\infty = \frac{1}{\rho_0}\sum_{i=0}^{N} \rho_i\, E\left[\log\left(1 + \eta\,\frac{a_{i+1}}{\rho_i}\, h_i^N \Lambda_i\right)\right] - N\,\frac{\log e}{\rho_0}\,\eta \prod_{i=0}^{N} h_i \tag{17}$$

where $a_{N+1} = 1$ by convention, $h_0, h_1, \ldots, h_N$ are the solutions of the system of $N+1$ equations

$$\prod_{j=0}^{N} h_j = \rho_i\, E\left[\frac{h_i^N \Lambda_i}{\frac{\rho_i}{a_{i+1}} + \eta\, h_i^N \Lambda_i}\right], \qquad i = 0, \ldots, N \tag{18}$$

and the expectation $E[\cdot]$ in (17) and (18) is over $\Lambda_i$ whose distribution is given by the asymptotic
eigenvalue distribution $F_{\mathbf{M}_i^H\mathbf{M}_i}(\lambda)$ of $\mathbf{M}_i^H\mathbf{M}_i$.

The detailed proof of Theorem 1 is presented in Appendix II.

We would like to stress that (17) holds for any arbitrary set of precoding matrices $\mathbf{P}_i$, $i = 0, \ldots, N-1$,
if $\mathbf{M}_i^H\mathbf{M}_i$ has a compactly supported asymptotic eigenvalue distribution when the system dimensions
grow large. We would like to point out that the power constraints on signals transmitted by the source
or relays are not sufficient to guarantee the boundedness of the eigenvalues of $\mathbf{M}_i^H\mathbf{M}_i$. In fact, as
(123) in Appendix II shows, in the asymptotic regime the power constraints impose upper-bounds
on the product of first-order moments of the eigenvalues of matrices $\mathbf{P}_i\mathbf{C}_{r,i}\mathbf{P}_i^H$ and $\mathbf{M}_k^H\mathbf{M}_k$; indeed
$\lim_{k_i \to \infty} \frac{1}{k_i}\mathrm{tr}(\mathbf{P}_i\mathbf{C}_{r,i}\mathbf{P}_i^H) = E[\lambda_{\mathbf{P}_i\mathbf{C}_{r,i}\mathbf{P}_i^H}]$ and $\lim_{k_k \to \infty} \frac{1}{k_k}\mathrm{tr}(\mathbf{C}_{t,k+1}\mathbf{P}_k\mathbf{C}_{r,k}\mathbf{P}_k^H) = E[\Lambda_k]$.
Unfortunately, these upper-bounds do not prevent the eigenvalue distribution of $\mathbf{M}_i^H\mathbf{M}_i$ from having an
unbounded support. Thus, the assumption that matrices $\mathbf{M}_i^H\mathbf{M}_i$ have a compactly supported asymptotic
eigenvalue distribution is a priori not an intrinsic property of the system model, and it was necessary to
make that assumption in order to use Lemma 2 to prove Theorem 1.

Given a set of precoding matrices, it can be observed from (17) and (18) that the asymptotic expression
is a deterministic value that depends only on channel statistics and not on a particular channel realization.
In other words, for a given set of precoding matrices, as long as the statistical properties of the channel
matrices do not change, the instantaneous mutual information always converges to the same deterministic
achievable rate, regardless of the channel realization. Thus, as the numbers of antennas at all levels grow
large, the instantaneous mutual information is not a random variable anymore and the precoding matrices
maximizing the asymptotic instantaneous mutual information can be found based only on knowledge of
the channel statistics, without requiring any information regarding the instantaneous channel realizations.
This further means that when the channel is random but fixed during the transmission and the system
size grows large enough, the Shannon capacity is not zero any more, contrary to the capacity of
small-size systems [16, Section 5.1].

Moreover, given the stationarity of channel statistical properties, the instantaneous mutual information 
converges to the same deterministic expression for any arbitrary channel realization. Therefore, the 
asymptotic instantaneous mutual information (17) obtained in the non-ergodic regime also represents
the asymptotic value of the average mutual information, whose expression is the same as the asymptotic 
ergodic end-to-end mutual information that would be obtained if the channel was an ergodic process. 

It should also be mentioned that, according to the experimental results illustrated in Section VI, the
system under consideration behaves as in the asymptotic regime even when it is equipped with a
reasonable finite number of antennas at each level. Therefore, (17) can also be efficiently used to evaluate
the instantaneous mutual information of a finite-size system.

IV. Optimal Transmission Strategy at Source and Relays 

In the previous section, the asymptotic instantaneous mutual information (17), (18) was derived considering
arbitrary precoding matrices $\mathbf{P}_i$, $i \in \{0, \ldots, N-1\}$. In this section, we analyze the optimal linear
precoding strategies $\mathbf{P}_i$, $i \in \{0, \ldots, N-1\}$ at source and relays that maximize the average
mutual information. We characterize the optimal transmit directions determined by the singular vectors
of the precoding matrices at source and relays, for a system with finite $k_0, k_1, \ldots, k_N$. It turns out that
those transmit directions are also the ones that maximize the asymptotic average mutual information. As
explained in Section III, in the asymptotic regime, the average mutual information and the instantaneous
mutual information have the same asymptotic value, therefore the singular vectors of the precoding
matrices maximizing the asymptotic average mutual information are also optimal for the asymptotic
instantaneous mutual information (17).

In future work, using the results on the optimal directions of transmission (singular vectors of $\mathbf{P}_i$) and
the asymptotic mutual information (17)-(18), we intend to derive the optimal power allocation (singular
values of $\mathbf{P}_i$) that maximizes the asymptotic instantaneous/average mutual information (17) using only
statistical knowledge of the channel at transmitting nodes.

The main result of this section is given by the following theorem: 

Theorem 2: Consider the system described in Section II. For $i \in \{1, \ldots, N\}$ let $\mathbf{C}_{t,i} = \mathbf{U}_{t,i}\mathbf{\Lambda}_{t,i}\mathbf{U}_{t,i}^H$
and $\mathbf{C}_{r,i} = \mathbf{U}_{r,i}\mathbf{\Lambda}_{r,i}\mathbf{U}_{r,i}^H$ be the eigenvalue decompositions of the correlation matrices $\mathbf{C}_{t,i}$ and $\mathbf{C}_{r,i}$,
where $\mathbf{U}_{t,i}$ and $\mathbf{U}_{r,i}$ are unitary and $\mathbf{\Lambda}_{t,i}$ and $\mathbf{\Lambda}_{r,i}$ are diagonal, with their respective eigenvalues ordered
in decreasing order. Then, under channel-knowledge assumptions $A_s$, $A_r$ and $A_d$, the optimal linear
precoding matrices that maximize the average mutual information under power constraints (7) can be
written as

$$\begin{aligned}
\mathbf{P}_0 &= \mathbf{U}_{t,1}\mathbf{\Lambda}_{P_0} \\
\mathbf{P}_i &= \mathbf{U}_{t,i+1}\mathbf{\Lambda}_{P_i}\mathbf{U}_{r,i}^H, \qquad \text{for } i \in \{1, \ldots, N-1\}
\end{aligned} \tag{19}$$

where $\mathbf{\Lambda}_{P_i}$ are diagonal matrices with non-negative real diagonal elements. Moreover, the singular vectors
of the precoding matrices (19) are also the ones that maximize the asymptotic average mutual information.
Since the asymptotic average mutual information has the same value as the asymptotic instantaneous
mutual information, the singular vectors of the precoding matrices (19) are also optimal for the asymptotic
instantaneous mutual information.

For the proof of Theorem 2, the reader is referred to Appendix III.
Theorem 2 indicates that to maximize the average mutual information

• the source should align the eigenvectors of the transmit covariance matrix $\mathbf{Q} = \mathbf{P}_0\mathbf{P}_0^H$ to the
eigenvectors of the transmit correlation matrix $\mathbf{C}_{t,1}$ of the first-hop channel $\mathbf{H}_1$. This alignment
requires only local statistical channel knowledge $A_s$. Note that similar results were previously
obtained for both single-user [18] and multi-user [19] single-hop (without relays) MIMO systems
with covariance knowledge at the source.

• relay $i$ should align the right singular vectors of its precoding matrix $\mathbf{P}_i$ to the eigenvectors of the
receive correlation matrix $\mathbf{C}_{r,i}$, and the left singular vectors of $\mathbf{P}_i$ to the eigenvectors of the transmit
correlation matrix $\mathbf{C}_{t,i+1}$. These alignments require only local statistical knowledge $A_r$.

Moreover, it follows from Theorem 2 that the optimization of $\mathbf{P}_i$ can be divided into two decoupled
problems: optimizing the transmit directions (singular vectors) on one hand, and optimizing the transmit
powers (singular values) on the other hand.

We would like to draw the reader's attention to the fact that the proof of this theorem does not rely on
the expression of the asymptotic mutual information given in (17). In fact, Theorem 2 is first proved in
the non-asymptotic regime for an arbitrary set of $\{k_i\}_{i \in \{0, \ldots, N\}}$. As such, regardless of the system size,
the singular vectors of the precoding matrices should always be aligned to the eigenvectors of the channel
correlation matrices to maximize the average mutual information. In particular, the singular vectors of
the precoding matrices that maximize the asymptotic average mutual information are also aligned to the
eigenvectors of channel correlation matrices as in (19). As explained in Section III, the instantaneous and
the average mutual informations have the same value in the asymptotic regime. Therefore, the singular
vectors given in (19) are also those that maximize the asymptotic instantaneous mutual information.

V. Application to MIMO Communication Scenarios 

In this section, Theorem 1 and Theorem 2 are applied to four different communication scenarios. In
the first two scenarios, the special case of non-relay assisted MIMO ($N = 1$) without path-loss ($a_1 = 1$) is
considered, and we show how (17) boils down to known results for the MIMO channel with or without
correlation. In the third and fourth scenarios, a multi-hop MIMO system is considered and the asymptotic
mutual information is developed in the uncorrelated and exponential correlation cases respectively.

A. Uncorrelated single-hop MIMO with statistical CSI at source 

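Consider a simple single-hop uncorrelated MIMO system with the same number of antennas at source
and destination, i.e. $\rho_0 = \rho_1 = 1$, and an i.i.d. Rayleigh fading channel, i.e. $\mathbf{C}_{t,1} = \mathbf{C}_{r,1} = \mathbf{I}$. Assuming
equal power allocation at source antennas, the source precoder is $\mathbf{P}_0 = \sqrt{\mathcal{P}_0}\,\mathbf{I}$. As $\mathbf{M}_0 = \mathbf{C}_{t,1}^{1/2}\mathbf{P}_0 = \sqrt{\mathcal{P}_0}\,\mathbf{I}$
and $\mathbf{M}_1 = \mathbf{C}_{r,1}^{1/2} = \mathbf{I}$, we have that

$$\begin{aligned}
dF_{\mathbf{M}_0^H\mathbf{M}_0}(\lambda) &= \delta(\lambda - \mathcal{P}_0)\, d\lambda \\
dF_{\mathbf{M}_1^H\mathbf{M}_1}(\lambda) &= \delta(\lambda - 1)\, d\lambda.
\end{aligned} \tag{20}$$

Using the distributions in (20) to compute the expectations in (17) yields

$$\begin{aligned}
I_\infty &= \frac{1}{\rho_0}\sum_{i=0}^{N} \rho_i\, E\left[\log\left(1 + \eta\,\frac{a_{i+1}}{\rho_i}\, h_i^N \Lambda_i\right)\right] - N\,\frac{\log e}{\rho_0}\,\eta \prod_{i=0}^{N} h_i \\
&= \log(1 + \eta h_0 \mathcal{P}_0) + \log(1 + \eta h_1) - \log e\; \eta\, h_0 h_1
\end{aligned} \tag{21}$$

where, according to (18), $h_0$ and $h_1$ are the solutions of the system of two equations

$$\begin{aligned}
h_0 &= \frac{1}{1 + \eta h_1} \\
h_1 &= \frac{\mathcal{P}_0}{1 + \eta h_0 \mathcal{P}_0}
\end{aligned} \tag{22}$$

that are given by

$$\begin{aligned}
h_0 &= \frac{2}{1 + \sqrt{1 + 4\eta\mathcal{P}_0}} \\
h_1 &= \frac{-1 + \sqrt{1 + 4\eta\mathcal{P}_0}}{2\eta}.
\end{aligned} \tag{23}$$

Using (23) in (21), we obtain

$$I_\infty = 2\log\left(\frac{1 + \sqrt{1 + 4\eta\mathcal{P}_0}}{2}\right) - \frac{\log e}{4\eta\mathcal{P}_0}\left(\sqrt{1 + 4\eta\mathcal{P}_0} - 1\right)^2. \tag{24}$$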



It can be observed that the deterministic expression (24) depends only on the system characteristics
and is independent from the channel realizations. Moreover, equal power allocation is known to be the
capacity-achieving power allocation for a MIMO i.i.d. Rayleigh channel with statistical CSI at source
[20, Section 3.3.2], [16]. As such, the asymptotic mutual information (24) also represents the asymptotic
capacity of the system. We should also mention that (24) is similar to the expression of the asymptotic
capacity per dimension previously derived in [20, Section 3.3.2] for the MIMO Rayleigh channel with
equal number of transmit and receive antennas and statistical CSI at the transmitter.
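The single-hop uncorrelated case lends itself to a quick numerical check. The sketch below assumes the two fixed-point equations read $h_0 = 1/(1+\eta h_1)$ and $h_1 = \mathcal{P}_0/(1+\eta h_0\mathcal{P}_0)$, solves them by simple iteration, and compares the result with the closed-form solution and with the closed-form mutual information. Function names, the iteration count and the test values of $\eta$ and $\mathcal{P}_0$ are choices made for this example.

```python
import math

def solve_fixed_point(eta, p0, iters=500):
    """Iterate h0 = 1/(1 + eta h1), h1 = p0/(1 + eta h0 p0) from an arbitrary start."""
    h0, h1 = 1.0, 1.0
    for _ in range(iters):
        h0 = 1.0 / (1.0 + eta * h1)
        h1 = p0 / (1.0 + eta * h0 * p0)
    return h0, h1

def closed_form_h(eta, p0):
    """Closed-form solution of the same two equations."""
    s = math.sqrt(1.0 + 4.0 * eta * p0)
    return 2.0 / (1.0 + s), (s - 1.0) / (2.0 * eta)

def mi_from_h(eta, p0, h0, h1):
    """Asymptotic mutual information (bits per source antenna) written in terms of h0, h1."""
    return (math.log2(1.0 + eta * h0 * p0) + math.log2(1.0 + eta * h1)
            - math.log2(math.e) * eta * h0 * h1)

def mi_closed_form(eta, p0):
    """Closed-form asymptotic mutual information for the uncorrelated single-hop case."""
    s = math.sqrt(1.0 + 4.0 * eta * p0)
    return 2.0 * math.log2((1.0 + s) / 2.0) - math.log2(math.e) / (4.0 * eta * p0) * (s - 1.0) ** 2

eta, p0 = 10.0, 1.0
h0, h1 = solve_fixed_point(eta, p0)
gap = abs(mi_from_h(eta, p0, h0, h1) - mi_closed_form(eta, p0))
```

The iteration contracts with factor $(\eta h_0 h_1)^2 < 1$ near the solution, so a few hundred steps are ample at moderate SNR; at very high SNR a root finder would be a safer choice.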

B. Correlated single-hop MIMO with statistical CSI at source 

In this example, we consider the more general case of correlated MIMO channel with separable
correlation: $\mathbf{H}_1 = \mathbf{C}_{r,1}^{1/2}\mathbf{\Theta}_1\mathbf{C}_{t,1}^{1/2}$. Let us denote the eigenvalue decomposition of $\mathbf{C}_{t,1}$ as

$$\mathbf{C}_{t,1} = \mathbf{U}_{t,1}\mathbf{\Lambda}_{t,1}\mathbf{U}_{t,1}^H \tag{25}$$

where $\mathbf{\Lambda}_{t,1}$ is a diagonal matrix whose diagonal entries are the eigenvalues of $\mathbf{C}_{t,1}$ in non-increasing
order and the unitary matrix $\mathbf{U}_{t,1}$ contains the corresponding eigenvectors. Defining the transmit covariance
matrix

$$\mathbf{Q} \triangleq E[\mathbf{x}_0\mathbf{x}_0^H] = \mathbf{P}_0\mathbf{P}_0^H, \tag{26}$$

it has been shown [18] that the capacity-achieving matrix $\mathbf{Q}^\star$ is given by

$$\mathbf{Q}^\star = \mathbf{U}_{t,1}\mathbf{\Lambda}_{Q^\star}\mathbf{U}_{t,1}^H \tag{27}$$

where $\mathbf{\Lambda}_{Q^\star}$ is a diagonal matrix containing the capacity-achieving power allocation. Using Theorem 1
along with (25) and (27), it can be readily shown that the asymptotic capacity per dimension is equal to

C = E[log(l + ^Ao/io)] + -E[log(l + 7?Ai/ii)] - i^T? hoh, (28) 

PO PO PQ 
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where /iq and hi are the solutions of the system 



ho = E 



Ai 



/ii = E 



1 + ??Ai/ii 
Ao 



(29) 




and the expectations are over Aq and Ai whose distributions are given by the asymptotic eigenvalue 
distributions of iAq* and C^, respectively. It should be mentioned that an equivalent expressioij^ was 
obtained in [20, Theorem 3.7] for the capacity of the correlated MIMO channel with statistical CSI at 
transmitter. 
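When the limiting spectra are discrete, the two coupled equations for $h_0$ and $h_1$ can be solved by alternating substitution. The sketch below assumes the system reads $h_0 = E[\Lambda_1/(1+\eta\Lambda_1 h_1)]$ and $h_1 = E[\Lambda_0/(1+\frac{\eta}{\rho_0}\Lambda_0 h_0)]$, and that each spectrum is given as a hypothetical list of (eigenvalue, probability) pairs. The two-point spectra used here are made up for illustration and do not correspond to a capacity-achieving power allocation.

```python
import math

def expect(dist, f):
    """E[f(Lambda)] for a discrete distribution given as (value, probability) pairs."""
    return sum(p * f(v) for v, p in dist)

def solve_h(dist0, dist1, eta, rho0, iters=2000):
    """Alternate between the two fixed-point equations until they settle."""
    h0, h1 = 1.0, 1.0
    for _ in range(iters):
        h0 = expect(dist1, lambda x: x / (1.0 + eta * x * h1))
        h1 = expect(dist0, lambda x: x / (1.0 + eta * x * h0 / rho0))
    return h0, h1

def asymptotic_rate(dist0, dist1, eta, rho0):
    """Asymptotic rate per source antenna (bits) for the correlated single-hop case."""
    h0, h1 = solve_h(dist0, dist1, eta, rho0)
    t0 = expect(dist0, lambda x: math.log2(1.0 + eta * x * h0 / rho0))
    t1 = expect(dist1, lambda x: math.log2(1.0 + eta * x * h1)) / rho0
    return t0 + t1 - math.log2(math.e) * eta * h0 * h1 / rho0

# Hypothetical two-point spectra for Lambda_0 and Lambda_1.
spec0 = [(0.5, 0.5), (1.5, 0.5)]
spec1 = [(0.4, 0.5), (1.6, 0.5)]
rate = asymptotic_rate(spec0, spec1, eta=5.0, rho0=0.5)
```

With point masses at $\mathcal{P}_0$ and 1 and $\rho_0 = 1$, the same routine reproduces the uncorrelated single-hop result, which gives a convenient sanity check.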

C. Uncorrelated multi-hop MIMO with statistical CSI at source and relays 

In this example, we consider an uncorrelated multi-hop MIMO system, i.e. all correlation matrices
are equal to identity. Then by Theorem 2 the optimal precoding matrices should be diagonal. Assuming
equal power allocation at source and relays, the precoding matrices are of the form $\mathbf{P}_i = \alpha_i \mathbf{I}_{k_i}$, where
$\alpha_i$ is real positive and chosen to respect the power constraints.

Using the power constraint expression (11231 ) in Appendix |llll it can be shown by induction on i that 
the coefficients in the uncorrelated case are given by 



Oat = 1. 

Then the asymptotic mutual information for the uncorrelated multi-hop MIMO system with equal 
power allocation is given by 



$^1$The small differences between (28) and the capacity expression in [20, Theorem 3.7] are due to different normalization assumptions in [20]. In particular, (28) is the mutual information per source antenna while the expression in [20] is the capacity per receive antenna. The equivalence between [20, Theorem 3.7] and (28) is obtained according to the following notation equivalence ({[20]-notation} ~ {(28)-notation}):





yi e {!,..., N -1} 



(31) 




(32) 



C ~ po-Too /3 ~ po SNR ~ Vol] r ~ — T" :?r 



A.R ~ Ai , both with distribution given by the eigenvalue distribution of Cr 
A ~ ^ , both with distribution given by the eigenvalue distribution of At^iAq* /To 



(30) 
where $h_0, h_1, \ldots, h_N$ are the solutions of the system of $N+1$ multivariate polynomial equations



1-0 1 + —7— 

Note that the asymptotic mutual information is a deterministic value depending only on a few system characteristics: signal power $\mathcal{P}_i$, noise power $1/\eta$, pathloss $a_i$, number of hops $N$ and ratio of the number of antennas $\rho_i$.

D. Exponentially correlated multi-hop MIMO with statistical CSI at source and relays 

In this example, the asymptotic mutual information (11) is developed in the case of exponential correlation matrices and precoding matrices with optimal singular vectors.

Optimal precoding directions: For $i \in \{1, \ldots, N\}$, the eigenvalue decompositions of channel correlation matrices $C_{t,i}$ and $C_{r,i}$ can be written as

$$C_{t,i} = U_{t,i}\Lambda_{t,i}U_{t,i}^H, \qquad C_{r,i} = U_{r,i}\Lambda_{r,i}U_{r,i}^H \tag{34}$$

where $U_{t,i}$ and $U_{r,i}$ are unitary, and $\Lambda_{t,i}$ and $\Lambda_{r,i}$ are diagonal with their respective eigenvalues ordered in decreasing order. Following Theorem 2, we consider precoding matrices of the form $P_i = U_{t,i+1}\Lambda_{P_i}U_{r,i}^H$, i.e. the singular vectors of $P_i$ are optimally aligned to the eigenvectors of channel correlation matrices. Consequently, we can rewrite matrices $M_i^H M_i$ (10) as

M^Mo = \J^^A%At,i\Jt,i 

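$$M_i^H M_i = U_{r,i}\Lambda_{r,i}\Lambda_{P_i}^2\Lambda_{t,i+1}U_{r,i}^H, \qquad i = 1, \ldots, N-1 \tag{35}$$

$$M_N^H M_N = U_{r,N}\Lambda_{r,N}U_{r,N}^H.$$

Thus, the eigenvalues of matrices $M_i^H M_i$ are contained in the following diagonal matrices

$$\Lambda_0 = \Lambda_{P_0}^2\Lambda_{t,1}, \qquad \Lambda_i = \Lambda_{r,i}\Lambda_{P_i}^2\Lambda_{t,i+1},\ \ i = 1, \ldots, N-1, \qquad \Lambda_N = \Lambda_{r,N}. \tag{36}$$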

The asymptotic mutual information, given by (11) and (12), involves expectations of functions of $\Lambda_i$ whose distribution is given by the asymptotic eigenvalue distribution $F_{M_i^H M_i}(\lambda)$ of $M_i^H M_i$. Equation (36) shows that a function $g_1(\Lambda_i)$ can be written as a function $g_2(\Lambda_{P_i}^2, \Lambda_{r,i}, \Lambda_{t,i+1})$, where the variables $\Lambda_{P_i}^2$, $\Lambda_{r,i}$, and $\Lambda_{t,i+1}$ are respectively characterized by the asymptotic eigenvalue distributions $F_{P_i^H P_i}(\lambda)$, $F_{C_{r,i}}(\lambda)$, and $F_{C_{t,i+1}}(\lambda)$ of matrices $P_i^H P_i$, $C_{r,i}$ and $C_{t,i+1}$ respectively. Therefore expectations in (11) and (12) can be computed using the asymptotic joint distribution of $(\Lambda_{P_i}^2, \Lambda_{r,i}, \Lambda_{t,i+1})$ instead of the distribution $F_{M_i^H M_i}(\lambda)$. To simplify notations, we rename the variables as follows

$$X = \Lambda_{P_i}^2, \qquad Y = \Lambda_{r,i}, \qquad Z = \Lambda_{t,i+1}. \tag{37}$$



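Then, the expectation of a function $g_1(\Lambda_i)$ can be written

$$\begin{aligned} E[g_1(\Lambda_i)] = E[g_2(X,Y,Z)] &= \int_z\int_y\int_x g_2(x,y,z)\, f_{X,Y,Z}(x,y,z)\, dx\, dy\, dz \\ &= \int_z\int_y\int_x g_2(x,y,z)\, f_{X|Y,Z}(x|y,z)\, f_{Y|Z}(y|z)\, f_Z(z)\, dx\, dy\, dz. \end{aligned} \tag{38}$$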

Exponential Correlation Model: So far, general correlation matrices were considered. We now introduce the exponential correlation model and further develop (38) for the distributions $f_{Y|Z}(y|z)$ and $f_Z(z)$ resulting from that particular correlation model.

We assume that Level $i$ is equipped with a uniform linear array (ULA) of length $L_i$, characterized by its antenna spacing $l_i = L_i/k_i$ and its characteristic distances $\Delta_{t,i}$ and $\Delta_{r,i}$ proportional to transmit and receive spatial coherences respectively. Then the receive and transmit correlation matrices at Level $i$ can respectively be modeled by the following Hermitian Wiener-class$^2$ Toeplitz matrices [22]-[24]:



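$$C_{r,i} = \begin{bmatrix} 1 & r_{r,i} & r_{r,i}^2 & \cdots & r_{r,i}^{k_i-1} \\ r_{r,i} & 1 & \ddots & \ddots & \vdots \\ r_{r,i}^2 & \ddots & \ddots & \ddots & r_{r,i}^2 \\ \vdots & \ddots & \ddots & 1 & r_{r,i} \\ r_{r,i}^{k_i-1} & \cdots & r_{r,i}^2 & r_{r,i} & 1 \end{bmatrix}_{k_i \times k_i} \quad \text{and} \quad C_{t,i+1} = \begin{bmatrix} 1 & r_{t,i+1} & r_{t,i+1}^2 & \cdots & r_{t,i+1}^{k_i-1} \\ r_{t,i+1} & 1 & \ddots & \ddots & \vdots \\ r_{t,i+1}^2 & \ddots & \ddots & \ddots & r_{t,i+1}^2 \\ \vdots & \ddots & \ddots & 1 & r_{t,i+1} \\ r_{t,i+1}^{k_i-1} & \cdots & r_{t,i+1}^2 & r_{t,i+1} & 1 \end{bmatrix}_{k_i \times k_i} \tag{39}$$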



where the antenna correlation at receive (resp. transmit) side $r_{r,i} = e^{-l_i/\Delta_{r,i}} \in [0, 1)$ (resp. $r_{t,i+1} = e^{-l_i/\Delta_{t,i}} \in [0, 1)$) is an exponential function of antenna spacing $l_i$ and characteristic distance $\Delta_{r,i}$ (resp. $\Delta_{t,i}$) at relaying Level $i$.
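As an illustration of this correlation model, the following minimal sketch builds a finite exponential correlation matrix with entries $r^{|m-n|}$. The function names `antenna_correlation` and `exponential_correlation` are hypothetical, and the mapping $r = e^{-\text{spacing}/\text{coherence}}$ is an assumed form of the exponential dependence described above.

```python
import numpy as np


def antenna_correlation(spacing: float, coherence: float) -> float:
    """Assumed model: r = exp(-spacing / coherence), which lies in (0, 1)."""
    if spacing <= 0 or coherence <= 0:
        raise ValueError("spacing and coherence must be positive")
    return float(np.exp(-spacing / coherence))


def exponential_correlation(k: int, r: float) -> np.ndarray:
    """k x k symmetric Toeplitz matrix with entries r**|m - n|."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r must lie in [0, 1)")
    idx = np.arange(k)
    return r ** np.abs(idx[:, None] - idx[None, :])


# Example: 4 antennas, spacing equal to half the characteristic distance.
r_demo = antenna_correlation(spacing=0.5, coherence=1.0)
C_demo = exponential_correlation(4, r_demo)
```

With $r = 0$ the matrix reduces to the identity, which corresponds to the uncorrelated case.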

$^2$A sequence of $n \times n$ Toeplitz matrices $T_n = [t_{k-j}]_{n \times n}$ is said to be in the Wiener class [21, Section 4.4] if the sequence $\{t_k\}$ of first-column and first-row elements is absolutely summable, i.e. $\lim_{n \to +\infty} \sum_{k=-n}^{n} |t_k| < \infty$. If $|r_{r,i}| < 1$, then $\lim_{k_i \to +\infty} \left( \sum_{k=0}^{k_i-1} r_{r,i}^{k} + \sum_{k=-(k_i-1)}^{-1} r_{r,i}^{-k} \right) = \frac{1}{1-r_{r,i}} + \frac{r_{r,i}}{1-r_{r,i}} < \infty$, and consequently $C_{r,i}$ is in the Wiener class. $C_{t,i}$ is obviously also in the Wiener class if $|r_{t,i}| < 1$.

As $k_i$ grows large, the sequence of Toeplitz matrices $C_{r,i}$ of size $k_i \times k_i$ is fully characterized by the continuous real function $f_{r,i}$, defined for $\lambda \in [0, 2\pi)$ by [21, Section 4.1]

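$$\begin{aligned} f_{r,i}(\lambda) &= \lim_{k_i \to +\infty} \left( \sum_{k=0}^{k_i-1} r_{r,i}^{k} e^{jk\lambda} + \sum_{k=-(k_i-1)}^{-1} r_{r,i}^{-k} e^{jk\lambda} \right) \\ &= \frac{1}{1 - r_{r,i}e^{j\lambda}} + \frac{r_{r,i}e^{-j\lambda}}{1 - r_{r,i}e^{-j\lambda}} \\ &= \frac{1 - r_{r,i}^2}{|1 - r_{r,i}e^{j\lambda}|^2}. \end{aligned} \tag{40}$$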

We also denote the essential infimum and supremum of $f_{r,i}$ by $m_{f_{r,i}}$ and $M_{f_{r,i}}$ respectively [21, Section 4.1]. In a similar way, we can define the continuous real function $f_{t,i+1}$ characterizing the sequence of Toeplitz matrices $C_{t,i+1}$ by replacing $r_{r,i}$ in (40) by $r_{t,i+1}$, and we denote by $m_{f_{t,i+1}}$ and $M_{f_{t,i+1}}$ its essential infimum and supremum respectively.
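The closed form of the characterizing function can be checked numerically. The sketch below compares it with a truncated version of the defining series and verifies that the eigenvalues of a finite exponential Toeplitz matrix stay between the infimum and supremum. The values $(1-r)/(1+r)$ and $(1+r)/(1-r)$ used for the bounds are derived here from the closed form (minimum at $\lambda = \pi$, maximum at $\lambda = 0$) and are not quoted from the text; the names `symbol` and `truncated_series` are illustrative.

```python
import numpy as np


def symbol(lam: np.ndarray, r: float) -> np.ndarray:
    """Closed form (1 - r^2) / |1 - r e^{j lam}|^2 of the Toeplitz symbol."""
    return (1.0 - r**2) / np.abs(1.0 - r * np.exp(1j * lam)) ** 2


def truncated_series(lam: np.ndarray, r: float, K: int) -> np.ndarray:
    """Partial sum of sum_k r^{|k|} e^{j k lam} over k = -K, ..., K."""
    k = np.arange(-K, K + 1)[:, None]
    terms = r ** np.abs(k) * np.exp(1j * k * lam[None, :])
    return np.real(np.sum(terms, axis=0))


r = 0.6
lam = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
m_f = (1.0 - r) / (1.0 + r)  # infimum of the symbol, attained at lam = pi
M_f = (1.0 + r) / (1.0 - r)  # supremum of the symbol, attained at lam = 0

# Eigenvalues of a 50 x 50 exponential correlation matrix.
idx = np.arange(50)
eigs = np.linalg.eigvalsh(r ** np.abs(idx[:, None] - idx[None, :]))
```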

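By Szegő Theorem [21, Theorem 9], recalled hereafter in Lemma 6, for any real function $g(\cdot)$ (resp. $h(\cdot)$) continuous on $[m_{f_{r,i}}, M_{f_{r,i}}]$ (resp. $[m_{f_{t,i+1}}, M_{f_{t,i+1}}]$), we have

$$\begin{aligned} \int_y g(y) f_Y(y)\, dy &\triangleq \lim_{k_i \to +\infty} \frac{1}{k_i} \sum_{k=1}^{k_i} g\big(\lambda_{C_{r,i}}(k)\big) = \frac{1}{2\pi} \int_0^{2\pi} g\big(f_{r,i}(\lambda)\big)\, d\lambda \\ \int_z h(z) f_Z(z)\, dz &\triangleq \lim_{k_i \to +\infty} \frac{1}{k_i} \sum_{k=1}^{k_i} h\big(\lambda_{C_{t,i+1}}(k)\big) = \frac{1}{2\pi} \int_0^{2\pi} h\big(f_{t,i+1}(\nu)\big)\, d\nu. \end{aligned} \tag{41}$$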

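Assuming that variables $Y = \Lambda_{r,i}$ and $Z = \Lambda_{t,i+1}$ are independent, and applying Szegő Theorem to (38), we can write

$$\begin{aligned} E[g_1(\Lambda_i)] &= \int_z \int_y \left( \int_x g_2(x,y,z)\, f_{X|Y,Z}(x|y,z)\, dx \right) f_Y(y)\, f_Z(z)\, dy\, dz \\ &= \int_z \left( \int_y g_3(y,z)\, f_Y(y)\, dy \right) f_Z(z)\, dz \\ &= \int_z \left( \frac{1}{2\pi} \int_{\lambda=0}^{2\pi} g_3\big(f_{r,i}(\lambda), z\big)\, d\lambda \right) f_Z(z)\, dz, \quad \text{by Szegő Theorem (41)} \\ &= \frac{1}{2\pi} \int_{\lambda=0}^{2\pi} \left( \int_z g_3\big(f_{r,i}(\lambda), z\big)\, f_Z(z)\, dz \right) d\lambda \\ &= \frac{1}{(2\pi)^2} \int_{\lambda=0}^{2\pi} \int_{\nu=0}^{2\pi} g_3\big(f_{r,i}(\lambda), f_{t,i+1}(\nu)\big)\, d\lambda\, d\nu, \quad \text{by Szegő Theorem (41)} \end{aligned} \tag{42}$$

where $g_3(y,z) \triangleq \int_x g_2(x,y,z)\, f_{X|Y,Z}(x|y,z)\, dx$.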

Equal power allocation over optimal precoding directions: We further assume equal power allocation over the optimal directions, i.e. the singular values of $P_i$ are chosen to be all equal: $\Lambda_{P_i} = \alpha_i I_{k_i}$, where $\alpha_i$ is real positive and chosen to respect the power constraint. Equal power allocation may not be the optimal power allocation scheme, but it is considered in this example for simplicity.

Using the power constraint expression for general correlation models (123) in Appendix III and considering precoding matrices $P_i = U_{t,i+1}(\alpha_i I_{k_i})U_{r,i}^H$ with optimal singular vectors as in Theorem 2 and equal singular values $\alpha_i$, we can show by induction on $i$ that the coefficients $\alpha_i$ respecting the power constraints for any correlation model are given by

"0 = 



= . IK \ rrr — i v Vz e {1, . . . ,iV - 1} (43) 

aN = 1- 

Applying the exponential correlation model to (43) and making the dimensions of the system grow large, it can be shown that in the asymptotic regime, the $\alpha_i$ respecting the power constraint for the exponentially correlated system converge to the same value (31) as for the uncorrelated system.

Then $X = \Lambda_{P_i}^2 = \alpha_i^2$ is independent from $Y$ and $Z$, thus $f_{X|Y,Z}(x|y,z) = f_X(x) = \delta(x - \alpha_i^2)$. Consequently,

$$g_3(y,z) = \int_x g_2(x,y,z)\, \delta(x - \alpha_i^2)\, dx = g_2(\alpha_i^2, y, z) \tag{44}$$

and (42) becomes

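$$E[g_1(\Lambda_i)] = \frac{1}{(2\pi)^2} \int_{\lambda=0}^{2\pi} \int_{\nu=0}^{2\pi} g_2\!\left( \alpha_i^2,\ \frac{1 - r_{r,i}^2}{|1 - r_{r,i}e^{j\lambda}|^2},\ \frac{1 - r_{t,i+1}^2}{|1 - r_{t,i+1}e^{j\nu}|^2} \right) d\lambda\, d\nu. \tag{45}$$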

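Asymptotic Mutual Information: Using (45) in (11) with $g_2(x,y,z) = \log\left(1 + \eta \frac{a_{i+1}}{\rho_i} h_i^N\, xyz\right)$ gives the expression of the asymptotic mutual information

$$I_\infty = \sum_{i=0}^{N} \frac{\rho_i}{\rho_0 (2\pi)^2} \int_{\lambda=0}^{2\pi} \int_{\nu=0}^{2\pi} \log\!\left( 1 + \frac{\eta\, h_i^N a_{i+1} \alpha_i^2}{\rho_i}\, \frac{(1 - r_{r,i}^2)(1 - r_{t,i+1}^2)}{|1 - r_{r,i}e^{j\lambda}|^2\, |1 - r_{t,i+1}e^{j\nu}|^2} \right) d\lambda\, d\nu \;-\; N \frac{\log e}{\rho_0}\, \eta \prod_{i=0}^{N} h_i \tag{46}$$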

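where $h_0, h_1, \ldots, h_N$ are the solutions of the following system of $N+1$ equations, obtained by using (45) in (12) with $g_2(x,y,z) = \frac{h_i^N\, xyz}{\rho_i/a_{i+1} + \eta h_i^N\, xyz}$:

$$\prod_{j=0}^{N} h_j = \frac{1}{(2\pi)^2} \int_{\lambda=0}^{2\pi} \int_{\nu=0}^{2\pi} \frac{\rho_i\, h_i^N a_{i+1} \alpha_i^2 (1 - r_{r,i}^2)(1 - r_{t,i+1}^2)}{\rho_i\, |1 - r_{r,i}e^{j\lambda}|^2\, |1 - r_{t,i+1}e^{j\nu}|^2 + \eta\, h_i^N a_{i+1} \alpha_i^2 (1 - r_{r,i}^2)(1 - r_{t,i+1}^2)}\, d\lambda\, d\nu \quad \text{for } i = 0, \ldots, N \tag{47}$$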

(with the convention $r_{r,0} = r_{t,N+1} = 0$). Using the changes of variables

$$\begin{aligned} t = \tan\frac{\lambda}{2}, &\quad \text{thus } \cos\lambda = \frac{1 - t^2}{1 + t^2} \text{ and } d\lambda = \frac{2\,dt}{1 + t^2} \\ u = \tan\frac{\nu}{2}, &\quad \text{thus } \cos\nu = \frac{1 - u^2}{1 + u^2} \text{ and } d\nu = \frac{2\,du}{1 + u^2} \end{aligned} \tag{48}$$

and performing some algebraic manipulations that are skipped for the sake of conciseness, (46) and (47) can be rewritten

r]hfa^+ia1 (l + i^) (1 + u^) \ dt du ,Joge 

' — A* 77 



V 2/ / \0g [ I + Cr,rCt.t+l- 



(49) 

where $h_0, h_1, \ldots, h_N$ are the solutions of the system of $N+1$ equations

hj = ' . ' Kirrii) C50) 

ycr.ict.i+i + — — ^J7—^^+ — J, — 



and 



1 ^^r^i 



r,i 

1 - ^,1+1 



i + n,i+i (51) 



mj = 1 



Those expressions show that only a few relevant parameters affect the performance of this complex system: signal power $\mathcal{P}_i$, noise power $1/\eta$, pathloss $a_i$, number of hops $N$, ratio of the number of antennas $\rho_i$, and correlation ratios $c_{r,i}$ and $c_{t,i}$.

VI. Numerical Results 

In this section, we present numerical results to validate Theorem 1 and to show that even with small $k_i$, $i = 0, \ldots, N$, the behavior of the system is close to its behavior in the asymptotic regime, making Theorem 1 a useful tool for optimization of finite-size systems as well as large networks.

A. Uncorrelated multi-hop MIMO 

The uncorrelated system described in Section V-C is first considered.

Fig. 2 plots the asymptotic mutual information from Theorem 1 as well as the instantaneous mutual information obtained for an arbitrary channel realization (shown as experimental curves in the figure). This example considers a system with 10 antennas at source, destination and each relay level with one, two or three hops. Fig. 3 plots the same curves as in Fig. 2 for a system with 100 antennas at each level. When increasing the number of hops $N$, the distance $d$ between source and destination is kept constant and $N-1$ relays are inserted between source and destination with equal spacing $d_i = d/N$ between each relaying level. In both examples, whose main purpose is not to optimize the system, but to validate the asymptotic formula in Theorem 1, matrices $P_i$ are taken proportional to the identity matrix to simulate equal power allocation. The channel correlation matrices are also equal to the identity matrix to mimic the uncorrelated channel. Moreover, the pathloss exponent $\beta = 2$ is considered. We would like to point out that the experimental curves for different channel realizations produced similar results. As such, the experimental curve corresponding to a single channel realization is shown for the sake of clarity and conciseness.
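An experimental curve of this kind could be produced along the following lines. This is only a sketch under stated assumptions, not the simulation code behind the figures: it assumes $k$ antennas at every level, hop matrices with i.i.d. complex Gaussian entries of variance $a_i/k$, scalar precoders $\alpha_i I$, and the per-source-antenna normalization $\frac{1}{k}\log_2\det(I + \eta G G^H)$. The function name `instantaneous_mi` and the numerical values in the example are hypothetical.

```python
import numpy as np


def instantaneous_mi(k, n_hops, eta, alphas, pathloss, rng):
    """(1 / k) * log2 det(I + eta * G G^H) for one channel realization of an
    uncorrelated chain with k antennas at every level (assumptions above)."""
    G = alphas[0] * np.eye(k, dtype=complex)              # source precoder
    for i in range(1, n_hops + 1):
        theta = rng.standard_normal((k, k)) + 1j * rng.standard_normal((k, k))
        theta *= np.sqrt(pathloss[i - 1] / (2.0 * k))     # assumed variance a_i / k
        G = theta @ G                                     # propagation over hop i
        if i < n_hops:
            G = alphas[i] * G                             # relay precoder alpha_i * I
    _, logdet = np.linalg.slogdet(np.eye(k) + eta * (G @ G.conj().T))
    return float(logdet / (k * np.log(2.0)))


# Example: two hops over total distance d = 1, so d_i = 1/2 and, with
# pathloss exponent 2, a_i = d_i**(-2) = 4. The alphas are placeholders.
mi_low = instantaneous_mi(10, 2, 1.0, [1.0, 1.0], [4.0, 4.0], np.random.default_rng(5))
mi_high = instantaneous_mi(10, 2, 10.0, [1.0, 1.0], [4.0, 4.0], np.random.default_rng(5))
mi_zero = instantaneous_mi(10, 2, 0.0, [1.0, 1.0], [4.0, 4.0], np.random.default_rng(5))
```

Reusing the same seed keeps the channel realization fixed, so the three values differ only through $\eta$.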

Fig. 3 shows the perfect match between the instantaneous mutual information for an arbitrary channel realization and the asymptotic mutual information, validating Theorem 1 for large network dimensions. On the other hand, Fig. 2 shows that the instantaneous mutual information of a system with a small number of antennas behaves very closely to the asymptotic regime, justifying the usefulness of the asymptotic formula even when evaluating the end-to-end mutual information of a system with small size.

Finally, Fig. 4 plots the asymptotic mutual information for one, two, and three hops, as well as the value of the instantaneous mutual information for random channel realizations when the number of antennas at all levels increases. The concentration of the instantaneous mutual information values around the asymptotic limit when the system size increases shows the convergence of the instantaneous mutual information towards the asymptotic limit as the number of antennas grows large at all levels with the same rate.



B. One-sided exponentially correlated multi-hop MIMO 

Based on the model discussed in Section V-D, the one-sided exponentially correlated system is considered in this section. In the case of one-sided correlation, e.g. $r_{r,i} = 0$ and $r_{t,i} \ge 0$ for all $i \in \{0, \ldots, N\}$, the asymptotic mutual information (52), (53) is reduced to

Ioo = y / log l + ■ — 2T M~ — 2"^ V\\hi (52) 

^ POTT y p, (Cf^,+ i +U^) J po fj^ 

where $h_0, h_1, \ldots, h_N$ are the solutions of the system of $N+1$ equations

TT /i - . ' ' 

One-sided correlation was considered to avoid the involved computation of the elliptic integral $K(m_i)$ in the system of equations (53), and therefore to simplify simulations.

Fig. 5 and 6 plot the asymptotic mutual information for 10 and 100 antennas at each level respectively, and one, two or three hops, as well as the instantaneous mutual information obtained for an arbitrary channel realization (shown as experimental curves in the figure). As in the uncorrelated case, the perfect match of the experimental and asymptotic curves in Fig. 6 with 100 antennas validates the asymptotic formula in Theorem 1 in the presence of correlation. Fig. 5 shows that even for a small number of antennas, the system behaves closely to the asymptotic regime in the correlated case.

Finally, Fig. 7 plots the instantaneous mutual information for random channel realizations against the size of the system and shows its convergence towards the asymptotic mutual information when the number of antennas increases. Comparing Fig. 7 to the corresponding Fig. 4 in the uncorrelated case, it appears that convergence towards the asymptotic limit is slower in the correlated case.

VII. Conclusion 

We studied a multi-hop MIMO relay network in the correlated fading environment, where relays at each 
level perform linear precoding on their received signal prior to retransmitting it to the next level. Using 
free probability theory, a closed-form expression of the instantaneous end-to-end mutual information 
was derived in the asymptotic regime where the number of antennas at all levels grows large. The 
asymptotic instantaneous end-to-end mutual information turns out to be a deterministic quantity that 
depends only on channel statistics and not on particular channel realizations. Moreover, it also serves 
as the asymptotic value of the average end-to-end mutual information. Simulation results verified that, 
even with a small number of antennas at each level, multi-hop systems behave closely to the asymptotic 
regime. This observation makes the derived asymptotic mutual information a powerful tool to optimize 
the instantaneous mutual information of finite-size systems with only statistical knowledge of the channel. 

We also showed that for any system size the left and right singular vectors of the optimal precoding 
matrices that maximize the average mutual information are aligned, at each level, with the eigenvectors 
of the transmit and receive correlation matrices of the forward and backward channels, respectively. 
Thus, the singular vectors of the optimal precoding matrices can be determined with only local statistical 
channel knowledge at each level. 

In the sequel, the analysis of the end-to-end mutual information in the asymptotic regime will first 
be extended to the case where noise impairs signal reception at each relaying level. Then, combining 
the expression of the asymptotic mutual information with the singular vectors of the optimal precoding 
matrices, future work will focus on optimizing the power allocation determined by the singular values of 
the precoding matrices. Finally, future research directions also include the analysis of the relay-clustering effect: given a total number of antennas $k_i$ at level $i$, instead of considering that the relaying level consists of a single relay equipped with many antennas ($k_i$), we can consider that a relaying level contains $n_i$ relays equipped with $(k_i/n_i)$ antennas. Clustering has a direct impact on the structure of correlation matrices: when the $k_i$ antennas at level $i$ are distributed among several relays, correlation matrices become block-diagonal matrices, whose blocks represent the correlation between antennas at a relay, while antennas at different relays sufficiently separated in space are supposed uncorrelated. In the limit of a relaying level containing $k_i$ relays equipped with a single antenna, we fall back to the case of uncorrelated fading with correlation matrices equal to identity. The optimal size of clusters in correlated fading is expected to depend on the SNR regime.
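The block-diagonal structure induced by clustering can be sketched as follows. The helper `clustered_correlation` is hypothetical, and it assumes, purely for illustration, that the antennas of each relay follow an exponential correlation with entries $r^{|m-n|}$ while antennas of different relays are uncorrelated.

```python
import numpy as np


def clustered_correlation(k: int, n_relays: int, r: float) -> np.ndarray:
    """Block-diagonal correlation of k antennas split evenly over n_relays relays.
    Each block is an exponential correlation matrix (illustrative assumption)."""
    if k % n_relays != 0:
        raise ValueError("k must be divisible by n_relays")
    size = k // n_relays
    idx = np.arange(size)
    block = r ** np.abs(idx[:, None] - idx[None, :])
    return np.kron(np.eye(n_relays), block)


C_single = clustered_correlation(6, 1, 0.5)   # one relay with 6 antennas
C_pairs = clustered_correlation(6, 2, 0.5)    # two relays with 3 antennas each
C_spread = clustered_correlation(6, 6, 0.7)   # six single-antenna relays
```

In the last case every block has size one, so the matrix is the identity, in line with the uncorrelated limit described above.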

Appendix I 
Transforms and lemmas 

Transforms and lemmas used in the proofs of Theorems 1 and 2 are provided and proved in this appendix, while the proofs of Theorems 1 and 2 are detailed in Appendices II and III respectively.

A. Transforms 

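Let $T$ be a square matrix of size $n$ with real eigenvalues $\lambda_T(1), \ldots, \lambda_T(n)$. The empirical eigenvalue distribution of $T$ is defined by

$$F_T^n(x) \triangleq \frac{1}{n} \sum_{i=1}^{n} u\big(x - \lambda_T(i)\big). \tag{54}$$

We define the following transformations [10]

$$\text{Stieltjes transform:} \quad G_T(s) \triangleq \int \frac{1}{\lambda - s}\, dF_T(\lambda) \tag{55}$$

$$\Upsilon_T(s) \triangleq \int \frac{s\lambda}{1 - s\lambda}\, dF_T(\lambda) \tag{56}$$

$$\text{S-transform:} \quad S_T(z) \triangleq \frac{z+1}{z}\, \Upsilon_T^{-1}(z) \tag{57}$$

where $\Upsilon^{-1}(\Upsilon(s)) = s$.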
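For a finite matrix these transforms reduce to averages over the eigenvalues. The sketch below evaluates them empirically and checks the identity $\Upsilon(s) = -1 - \frac{1}{s} G\!\left(\frac{1}{s}\right)$ that is used later in the proof of Lemma 1. The function names are illustrative; the S-transform is evaluated only at the point $z = \Upsilon(s)$, where the inverse $\Upsilon^{-1}(z) = s$ is known without numerical inversion.

```python
import numpy as np


def stieltjes(eigs: np.ndarray, s: float) -> float:
    """Empirical Stieltjes transform: mean of 1 / (lambda - s)."""
    return float(np.mean(1.0 / (eigs - s)))


def upsilon(eigs: np.ndarray, s: float) -> float:
    """Empirical Upsilon transform: mean of s*lambda / (1 - s*lambda)."""
    return float(np.mean(s * eigs / (1.0 - s * eigs)))


def s_transform_point(eigs: np.ndarray, s: float):
    """Return (z, S(z)) at z = Upsilon(s), using S(z) = (z + 1) / z * Upsilon^{-1}(z)."""
    z = upsilon(eigs, s)
    return z, (z + 1.0) / z * s


rng = np.random.default_rng(0)
X = rng.standard_normal((6, 6))
eigs = np.linalg.eigvalsh(X @ X.T)   # real, non-negative eigenvalues
s = -0.7                             # negative s keeps 1 - s*lambda positive

# For T = a*I the S-transform is the constant 1/a.
z_const, S_const = s_transform_point(np.full(5, 2.5), s)
```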

B. Lemmas 

We present here the lemmas used in the proofs of Theorems 1 and 2. Lemmas 1, 3, 5 and 7 are proved in Appendix I-C, while Lemmas 2, 6 and 4 are taken from [25], [21], and [26] respectively.

Lemma 1: Consider an $n \times p$ matrix $A$ and a $p \times n$ matrix $B$, such that their product $AB$ has non-negative real eigenvalues. Denote $\xi = \frac{p}{n}$. Then

$$S_{AB}(z) = \frac{z+1}{z+\xi}\, S_{BA}\!\left(\frac{z}{\xi}\right). \tag{58}$$

Note that Lemma 1 is a more general form of the results derived in [27, Eq. (1.2)], [10, Eq. (15)].

Lemma 2 ([25, Prop. 4.4.9 and 4.4.11]): For $n \in \mathbb{N}$, let $p(n) \in \mathbb{N}$ be such that $\frac{p(n)}{n} \to \zeta$ as $n \to \infty$. Let

• $\Theta(n)$ be a $p(n) \times n$ complex Gaussian random matrix with i.i.d. elements with variance $\frac{1}{n}$.

• $A(n)$ be an $n \times n$ constant matrix such that $\sup_n \|A(n)\| < +\infty$ and $(A(n), A(n)^H)$ has the limit eigenvalue distribution $\mu$.

• $B(n)$ be a $p(n) \times p(n)$ Hermitian random matrix, independent from $\Theta(n)$, with an empirical eigenvalue distribution converging almost surely to a compactly supported probability measure $\nu$.

Then, as $n \to \infty$,

• the empirical eigenvalue distribution of $\Theta(n)^H B(n) \Theta(n)$ converges almost surely to the compound free Poisson distribution $\pi_{\nu,\zeta}$ [25]

• the family $(\{\Theta(n)^H B(n) \Theta(n)\}, \{A(n), A(n)^H\})$ is asymptotically free almost everywhere.

Thus the limiting eigenvalue distribution of $\Theta(n)^H B(n) \Theta(n) A(n) A(n)^H$ is the free convolution $\pi_{\nu,\zeta} \boxtimes \mu$ and its S-transform is

$$S_{\Theta^H B \Theta A A^H}(z) = S_{\Theta^H B \Theta}(z)\, S_{A A^H}(z). \tag{59}$$

Note that if the elements of $\Theta(n)$ had variance $\frac{1}{p(n)}$ instead of $\frac{1}{n}$, $(\{\Theta(n)^H B(n) \Theta(n)\}, \{A(n), A(n)^H\})$ would still be asymptotically free almost everywhere, and consequently, Equation (59) would still hold.

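Lemma 3: Consider an $n \times p$ matrix $A$ with zero-mean i.i.d. entries with variance $\frac{a}{p}$. Assume that the dimensions go to infinity while $\frac{n}{p} \to \zeta$, then

$$S_{AA^H}(z) = \frac{1}{a}\, \frac{1}{(1 + \zeta z)}, \qquad S_{A^H A}(z) = \frac{1}{a}\, \frac{1}{(z + \zeta)}. \tag{60}$$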

Lemma 4 ([26, Theorem H.1.h]): Let $A$ and $B$ be two positive semi-definite Hermitian matrices of size $n \times n$. Let $\lambda_A(i)$ and $\lambda_B(i)$ be their decreasingly-ordered eigenvalues respectively. Then the following inequality holds:

$$\sum_{i=1}^{n} \lambda_A(i)\, \lambda_B(n-i+1) \;\le\; \mathrm{tr}(AB) = \sum_{i=1}^{n} \lambda_{AB}(i) \;\le\; \sum_{i=1}^{n} \lambda_A(i)\, \lambda_B(i). \tag{61}$$
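The two-sided trace bound of Lemma 4 is easy to probe numerically. The following sketch, with hypothetical helper names, draws random positive semi-definite Hermitian matrices and returns the lower bound, the trace, and the upper bound.

```python
import numpy as np


def random_psd(n: int, rng) -> np.ndarray:
    """Random n x n positive semi-definite Hermitian matrix X X^H."""
    X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return X @ X.conj().T


def trace_bounds(A: np.ndarray, B: np.ndarray):
    """Return (lower, tr(AB), upper) with eigenvalues sorted in decreasing order."""
    la = np.sort(np.linalg.eigvalsh(A))[::-1]
    lb = np.sort(np.linalg.eigvalsh(B))[::-1]
    lower = float(np.sum(la * lb[::-1]))   # largest paired with smallest
    middle = float(np.real(np.trace(A @ B)))
    upper = float(np.sum(la * lb))         # largest paired with largest
    return lower, middle, upper


rng = np.random.default_rng(1)
results = [trace_bounds(random_psd(5, rng), random_psd(5, rng)) for _ in range(20)]

# Commuting example: aligned eigenvectors and equally ordered eigenvalues reach the upper bound.
aligned = trace_bounds(np.diag([3.0, 2.0, 1.0]), np.diag([5.0, 4.0, 0.0]))
```

The aligned example illustrates why aligning singular vectors with correlation eigenvectors is the extremal choice in trace expressions of this kind.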

Lemma 5: For i G {1, . . . , N}, let Aj be a ?ij x rij-i random matrix. Assume that 

• Ai, . . . , Aat are mutually independent, 

• Hi goes to infinity while — > Q, 
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• as 7ii goes to infinity, the eigenvalue distribution of AjA|^ converges almost surely in distribution 
to a compactly supported measure fj, 

• as ni,...,ni\i go to infinity, the eigenvalue distribution of {(S)i=N -^i)i^i=N -^i)^ converges 
almost surely in distribution to a measure fi^- 

Then is compactly supported. 

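Lemma 6 ([21, Theorem 9]): Let $T_n$ be a sequence of Wiener-class Toeplitz matrices, characterized by the function $f(\lambda)$ with essential infimum $m_f$ and essential supremum $M_f$. Let $\lambda_{T_n}(1), \ldots, \lambda_{T_n}(n)$ be the eigenvalues of $T_n$ and $s$ be any positive integer. Then

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} \lambda_{T_n}^{s}(k) = \frac{1}{2\pi} \int_0^{2\pi} f(\lambda)^{s}\, d\lambda. \tag{62}$$

Furthermore, if $f(\lambda)$ is real, or equivalently, the matrices $T_n$ are all Hermitian, then for any function $g(\cdot)$ continuous on $[m_f, M_f]$

$$\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} g\big(\lambda_{T_n}(k)\big) = \frac{1}{2\pi} \int_0^{2\pi} g\big(f(\lambda)\big)\, d\lambda. \tag{63}$$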
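A quick numerical illustration of Lemma 6 for the exponential Toeplitz family with entries $r^{|k-j|}$: the sketch compares the eigenvalue average at a finite size with the integral of the characterizing function $(1 - r^2)/|1 - r e^{j\lambda}|^2$. The choice $g(x) = \log_2(1 + x)$, the matrix size, and the helper names are illustrative assumptions; the integral is approximated by a uniform Riemann sum.

```python
import numpy as np


def toeplitz_eigs(n: int, r: float) -> np.ndarray:
    """Eigenvalues of the n x n Toeplitz matrix with entries r**|k - j|."""
    idx = np.arange(n)
    return np.linalg.eigvalsh(r ** np.abs(idx[:, None] - idx[None, :]))


def szego_limit(g, r: float, grid: int = 4096) -> float:
    """(1 / 2 pi) * integral over [0, 2 pi) of g(f(lam)), by a uniform Riemann sum."""
    lam = np.linspace(0.0, 2.0 * np.pi, grid, endpoint=False)
    f = (1.0 - r**2) / np.abs(1.0 - r * np.exp(1j * lam)) ** 2
    return float(np.mean(g(f)))


def g(x):
    return np.log2(1.0 + x)


r = 0.5
finite_n = float(np.mean(g(toeplitz_eigs(400, r))))   # left-hand side at n = 400
limit = szego_limit(g, r)                             # right-hand side
second_moment = szego_limit(lambda x: x**2, r)        # power case with s = 2
```

For $s = 2$ the integral equals $\sum_k r^{2|k|} = (1 + r^2)/(1 - r^2)$, which gives an independent check of the quadrature.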
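Lemma 7: For $i \ge 1$, given a set of deterministic matrices $\{A_k\}_{k \in \{0, \ldots, i\}}$ and a set of independent random matrices $\{\Theta_k\}_{k \in \{1, \ldots, i\}}$, with i.i.d. zero-mean Gaussian elements with variance $\sigma_k^2$,

$$\mathrm{tr}\!\left( E\!\left[ \bigotimes_{k=i}^{1} \{A_k \Theta_k\}\, A_0 A_0^H \bigotimes_{k=1}^{i} \{\Theta_k^H A_k^H\} \right] \right) = \mathrm{tr}(A_0 A_0^H) \prod_{k=1}^{i} \sigma_k^2\, \mathrm{tr}(A_k A_k^H). \tag{64}$$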
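The single-hop case $i = 1$ of Lemma 7 can be checked by Monte Carlo simulation. The sketch below is an illustration with arbitrary small dimensions and a hypothetical function name; it draws complex Gaussian matrices whose entries satisfy $E|\theta|^2 = \sigma^2$ and compares the sample mean of $\mathrm{tr}(A_1 \Theta A_0 A_0^H \Theta^H A_1^H)$ with the closed form.

```python
import numpy as np


def lemma7_sides(A1, A0, sigma2, trials, rng):
    """Monte Carlo estimate of tr E[A1 T A0 A0^H T^H A1^H] and its closed form."""
    rows, cols = A1.shape[1], A0.shape[0]
    T = rng.standard_normal((trials, rows, cols)) + 1j * rng.standard_normal((trials, rows, cols))
    T *= np.sqrt(sigma2 / 2.0)                 # so that E|T_kl|^2 = sigma2
    prod = A1 @ T @ A0                         # matmul broadcasts over the trial axis
    estimate = float(np.mean(np.sum(np.abs(prod) ** 2, axis=(1, 2))))
    closed = sigma2 * np.trace(A1 @ A1.conj().T).real * np.trace(A0 @ A0.conj().T).real
    return estimate, float(closed)


rng = np.random.default_rng(2)
A1 = rng.standard_normal((3, 4))
A0 = rng.standard_normal((5, 2))
estimate, closed = lemma7_sides(A1, A0, sigma2=0.25, trials=20000, rng=rng)
```

The trace equals the squared Frobenius norm of $A_1 \Theta A_0$, which is what the code averages.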



C. Proofs of Lemmas 

The proofs of Lemmas 1, 3, 5 and 7 are given hereafter.

Proof of Lemma 1

Given two complex matrices $A$ of size $m \times n$, and $B$ of size $n \times m$, their products $AB$ and $BA$ have the same $k$ non-zero eigenvalues $\lambda_{AB}(1), \ldots, \lambda_{AB}(k)$ with the same respective multiplicities $m_1, \ldots, m_k$. However, the multiplicities $m_0$ and $m_0'$ of the 0-eigenvalues of $AB$ and $BA$ respectively, are related as follows:

$$m_0 + n = m_0' + m. \tag{65}$$

Assuming that $AB$, and therefore $BA$, has real eigenvalues, we show hereafter that (58) holds.
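The empirical eigenvalue distributions of $AB$ and $BA$ are defined by

$$\begin{aligned} F_{AB}^{m}(\lambda) &= \frac{m_0}{m}\, u(\lambda) + \frac{1}{m} \sum_{i=1}^{k} m_i\, u\big(\lambda - \lambda_{AB}(i)\big) \\ F_{BA}^{n}(\lambda) &= \frac{m_0'}{n}\, u(\lambda) + \frac{1}{n} \sum_{i=1}^{k} m_i\, u\big(\lambda - \lambda_{AB}(i)\big). \end{aligned} \tag{66}$$

Using (65), we get

$$F_{AB}^{m}(\lambda) = \frac{n}{m}\, F_{BA}^{n}(\lambda) + \left(1 - \frac{n}{m}\right) u(\lambda). \tag{67}$$

From (67), it is direct to show that

$$G_{AB}(z) = \frac{n}{m}\, G_{BA}(z) - \left(1 - \frac{n}{m}\right) \frac{1}{z}. \tag{68}$$

As $\Upsilon(s) = -1 - \frac{1}{s} G\!\left(\frac{1}{s}\right)$, from (68), we obtain

$$\Upsilon_{AB}(s) = \frac{n}{m}\, \Upsilon_{BA}(s). \tag{69}$$

Finally, using $\left\{ z = \Upsilon_{AB}(s) = \frac{n}{m} \Upsilon_{BA}(s) \right\} \Leftrightarrow \left\{ \Upsilon_{AB}^{-1}(z) = s = \Upsilon_{BA}^{-1}\!\left(\frac{z}{n/m}\right) \right\}$ and the definition of the S-transform $S(z) \triangleq \frac{z+1}{z} \Upsilon^{-1}(z)$ yields the desired result

$$S_{AB}(z) = \frac{z+1}{z + \frac{n}{m}}\, S_{BA}\!\left(\frac{z}{n/m}\right). \tag{70}$$

This concludes the proof of Lemma 1.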

■ 
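The scaling relations established in this proof can be verified on a small example. The sketch below takes the special case $B = A^H$ (an assumption made so that $AB$ and $BA$ are Hermitian and their eigenvalues are real and non-negative), evaluates the empirical $\Upsilon$ transforms from the eigenvalues, and checks both the relation between $\Upsilon_{AB}$ and $\Upsilon_{BA}$ and the resulting S-transform identity at the point $z = \Upsilon_{AB}(s)$.

```python
import numpy as np


def upsilon(eigs: np.ndarray, s: float) -> float:
    """Empirical Upsilon transform: mean of s*lambda / (1 - s*lambda)."""
    return float(np.mean(s * eigs / (1.0 - s * eigs)))


rng = np.random.default_rng(3)
m, n = 3, 5
A = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
B = A.conj().T                        # assumption: B = A^H

eig_AB = np.linalg.eigvalsh(A @ B)    # m eigenvalues
eig_BA = np.linalg.eigvalsh(B @ A)    # n eigenvalues, n - m of them numerically zero

s = -0.4
ratio = n / m
z = upsilon(eig_AB, s)                # z = Upsilon_AB(s)
z_ba = upsilon(eig_BA, s)             # Upsilon_BA(s), expected to equal z / ratio
S_AB = (z + 1.0) / z * s              # S_AB evaluated at z
S_BA = (z_ba + 1.0) / z_ba * s        # S_BA evaluated at z / ratio
```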

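Proof of Lemma 3

Consider an $n \times p$ matrix $A$ with zero-mean i.i.d. entries with variance $\frac{a}{p}$. Let $X = \frac{1}{\sqrt{a}} A$ denote the normalized version of $A$ with zero-mean i.i.d. entries of variance $\frac{1}{p}$ and define $Y = a I_n$ and $Z = XX^H Y = AA^H$. It is direct to show that $S_Y(z) = \frac{1}{a}$. Using the latter result along with [10, Theorem 1], we obtain

$$\begin{aligned} S_{XX^H}(z) &= \frac{1}{(1 + \zeta z)} \\ S_{AA^H}(z) = S_Z(z) &= S_{XX^H}(z)\, S_Y(z) = \frac{1}{a}\, \frac{1}{(1 + \zeta z)}. \end{aligned} \tag{71}$$

Applying Lemma 1 to $S_{A^H A}(z)$ yields

$$S_{A^H A}(z) = \frac{z+1}{z+\zeta}\, S_{AA^H}\!\left(\frac{z}{\zeta}\right) = \frac{1}{a}\, \frac{1}{(z + \zeta)}. \tag{72}$$

This completes the proof of Lemma 3.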

■ 

Proof of Lemma 5

The proof of Lemma 5 is done by induction on $N$. For $N = 1$, Lemma 5 obviously holds. Assuming that Lemma 5 holds for $N$, we now show that it also holds for $N + 1$.

We first recall that the eigenvalues of Gramian matrices $AA^H$ are non-negative. Thus the support of $\mu_{N+1}$ is lower-bounded by 0, and we are left with showing that it is also upper-bounded.

Denoting $B_N = \left(\bigotimes_{i=N}^{1} A_i\right)\left(\bigotimes_{i=N}^{1} A_i\right)^H$, we can write

$$B_{N+1} = A_{N+1} B_N A_{N+1}^H. \tag{73}$$

For a matrix $A$, let $\lambda_{A,\max}$ denote its largest eigenvalue. The largest eigenvalue of $B_{N+1}$ is given by

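$$\begin{aligned} \lambda_{B_{N+1},\max} &= \max_{x} \frac{x^H B_{N+1}\, x}{x^H x} \\ &= \max_{x} \frac{x^H A_{N+1} B_N A_{N+1}^H\, x}{x^H x} \\ &= \max_{x} \frac{\mathrm{tr}\big(B_N A_{N+1}^H x x^H A_{N+1}\big)}{x^H x} \\ &\le \max_{x} \frac{\sum_{k=1}^{n_N} \lambda_{B_N}(k)\, \lambda_{A_{N+1}^H x x^H A_{N+1}}(k)}{x^H x}, \quad \text{by Lemma 4} \\ &\le \max_{x}\ \lambda_{B_N,\max}\, \frac{\mathrm{tr}\big(A_{N+1}^H x x^H A_{N+1}\big)}{x^H x} \\ &= \lambda_{B_N,\max}\, \max_{x} \frac{x^H A_{N+1} A_{N+1}^H\, x}{x^H x} \\ &= \lambda_{B_N,\max}\ \lambda_{A_{N+1}A_{N+1}^H,\max}. \end{aligned} \tag{74}$$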

To simplify notations, we rename the random variables as follows:

$$X = \lambda_{B_{N+1},\max}, \qquad Y = \lambda_{B_N,\max}, \qquad Z = \lambda_{A_{N+1}A_{N+1}^H,\max}. \tag{75}$$

Then (74) can be rewritten

$$X \le YZ. \tag{76}$$

Let $a \ge 0$; by (76) we have

$$F_X(a) = \Pr\{X \le a\} \ge \Pr\{YZ \le a\} = F_{YZ}(a) \tag{77}$$

which still holds for the asymptotic distributions as $n_1, \ldots, n_{N+1} \to \infty$, while $\frac{n_i}{n_{i-1}} \to \zeta_i$. Denoting the plane region $\mathcal{D}_a = \{x, y \ge 0 \,/\, xy \le a\}$, we can write



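$$\begin{aligned} F_{YZ}(a) &= \iint_{(y,z) \in \mathcal{D}_a} f_{Y,Z}(y,z)\, dy\, dz \\ &= \iint_{(y,z) \in \mathcal{D}_a} f_Y(y)\, f_Z(z)\, dy\, dz, \quad \text{by independence of } Y \text{ and } Z \\ &= \int_{y=0}^{+\infty} \left( \int_{z=0}^{a/y} f_Z(z)\, dz \right) f_Y(y)\, dy \\ &= \int_{y=0}^{+\infty} F_Z\!\left(\frac{a}{y}\right) f_Y(y)\, dy. \end{aligned} \tag{78}$$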
By assumption, the distributions of $A_{N+1}A_{N+1}^H$ and $B_N$ converge almost surely to compactly supported measures. Thus, their largest eigenvalues are asymptotically upper-bounded and the supports of the asymptotic distributions of $Y$ and $Z$ are upper-bounded, i.e.

$$\begin{aligned} &\exists\, c_y > 0 \text{ such that } \forall y > c_y,\ F_Y(y) = 1 \quad (f_Y(y) = 0) \\ &\exists\, c_z > 0 \text{ such that } \forall z > c_z,\ F_Z(z) = 1 \quad (f_Z(z) = 0). \end{aligned} \tag{79}$$

Let $a > c_y c_z$, then for all $0 < y \le c_y$, we have $\frac{a}{y} \ge \frac{a}{c_y} > c_z$ and $F_Z\!\left(\frac{a}{y}\right) = 1$, as the dimensions go to infinity with constant rates. Therefore, in the asymptotic regime, we have

$$F_{YZ}(a) = \int_{y=0}^{c_y} F_Z\!\left(\frac{a}{y}\right) f_Y(y)\, dy = \int_{y=0}^{c_y} f_Y(y)\, dy = F_Y(c_y) = 1. \tag{80}$$

Combining (77) and (80), we get $F_X(a) = 1$ for $a > c_y c_z$. Thus, there exists a constant $c_x$ such that $0 < c_x \le c_y c_z$ and $\forall x > c_x$, $F_X(x) = 1$, which means that the support of the asymptotic distribution of $X$ is upper-bounded. As a consequence, the support of the asymptotic eigenvalue distribution of $B_{N+1}$ is also upper-bounded. Therefore, the support of $\mu_{N+1}$ is upper-bounded, which concludes the proof.
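The key inequality of this proof, namely that the largest eigenvalue of $A B A^H$ is at most the product of the largest eigenvalues of $B$ and $A A^H$, can be spot-checked on random instances. The sketch below uses arbitrary small dimensions and hypothetical names, and records the gap between the two sides.

```python
import numpy as np

rng = np.random.default_rng(4)


def lam_max(H: np.ndarray) -> float:
    """Largest eigenvalue of a Hermitian matrix (eigvalsh sorts in ascending order)."""
    return float(np.linalg.eigvalsh(H)[-1])


gaps = []
for _ in range(50):
    A = rng.standard_normal((4, 6)) + 1j * rng.standard_normal((4, 6))
    X = rng.standard_normal((6, 6)) + 1j * rng.standard_normal((6, 6))
    B = X @ X.conj().T                         # positive semi-definite Gramian
    lhs = lam_max(A @ B @ A.conj().T)          # largest eigenvalue of the extended product
    rhs = lam_max(B) * lam_max(A @ A.conj().T)
    gaps.append(rhs - lhs)
```

A non-negative gap on every draw is consistent with the bound; it is a sanity check rather than a proof.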



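Proof of Lemma 7

The proof of Lemma 7 is done by induction.

We first prove that Lemma 7 holds for $i = 1$. To that purpose, we define the matrix $B = A_1 \Theta_1 A_0 A_0^H \Theta_1^H A_1^H$. Then

$$\mathrm{tr}\big(E[A_1 \Theta_1 A_0 A_0^H \Theta_1^H A_1^H]\big) = \mathrm{tr}(E[B]) = \sum_{j=1}^{k_1} E[b_{jj}]. \tag{81}$$

The expectation of the $j^{th}$ diagonal element $b_{jj}$ of matrix $B$ is

$$\begin{aligned} E[b_{jj}] &= \sum_{k,l,m,n,p} a^{(1)}_{jk}\, a^{(0)}_{lm}\, a^{(0)*}_{nm}\, a^{(1)*}_{jp}\, E\big[\theta^{(1)}_{kl}\, \theta^{(1)*}_{pn}\big] \\ &= \sum_{k,l,m} |a^{(1)}_{jk}|^2\, |a^{(0)}_{lm}|^2\, E\big[|\theta^{(1)}_{kl}|^2\big] \\ &= \sigma_1^2 \sum_{k} |a^{(1)}_{jk}|^2 \sum_{l,m} |a^{(0)}_{lm}|^2 \end{aligned} \tag{82}$$

where the second equality is due to the fact that $E\big[\theta^{(1)}_{kl}\, \theta^{(1)*}_{pn}\big] = \sigma_1^2\, \delta_{k,p}\, \delta_{l,n}$. It follows from (81) and (82) that

$$\mathrm{tr}(E[B]) = \sigma_1^2 \sum_{j,k} |a^{(1)}_{jk}|^2 \sum_{l,m} |a^{(0)}_{lm}|^2 = \sigma_1^2\, \mathrm{tr}(A_1 A_1^H)\, \mathrm{tr}(A_0 A_0^H) \tag{83}$$

which shows that Lemma 7 holds for $i = 1$.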

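Now, assuming that Lemma 7 holds for $i - 1$, we show it also holds for $i$. We define the matrix $B_i = \bigotimes_{k=i}^{1} \{A_k \Theta_k\}\, A_0 A_0^H \bigotimes_{k=1}^{i} \{\Theta_k^H A_k^H\}$. Then

$$\mathrm{tr}(E[B_i]) = \mathrm{tr}\big(E[A_i \Theta_i B_{i-1} \Theta_i^H A_i^H]\big) = \sum_{j=1}^{k_i} E\big[b^{(i)}_{jj}\big]. \tag{84}$$

The expectation of the $j^{th}$ diagonal element $b^{(i)}_{jj}$ of matrix $B_i$ is

$$\begin{aligned} E\big[b^{(i)}_{jj}\big] &= \sum_{k,l,m,n} a^{(i)}_{jk}\, a^{(i)*}_{jn}\, E\big[\theta^{(i)}_{kl}\, b^{(i-1)}_{lm}\, \theta^{(i)*}_{nm}\big] \\ &= \sum_{k,l} |a^{(i)}_{jk}|^2\, E\big[b^{(i-1)}_{ll}\big]\, E\big[|\theta^{(i)}_{kl}|^2\big] \\ &= \sigma_i^2 \sum_{k} |a^{(i)}_{jk}|^2 \sum_{l} E\big[b^{(i-1)}_{ll}\big] \end{aligned} \tag{85}$$

where the second equality is due to the independence of $\Theta_i$ and $B_{i-1}$ and to the fact that $E\big[\theta^{(i)}_{kl}\, \theta^{(i)*}_{nm}\big] = \sigma_i^2\, \delta_{k,n}\, \delta_{l,m}$. Thus (84) becomes

$$\begin{aligned} \mathrm{tr}(E[B_i]) &= \sigma_i^2 \sum_{j,k} |a^{(i)}_{jk}|^2 \sum_{l} E\big[b^{(i-1)}_{ll}\big] = \sigma_i^2\, \mathrm{tr}(A_i A_i^H)\, \mathrm{tr}(E[B_{i-1}]) \\ &= \sigma_i^2\, \mathrm{tr}(A_i A_i^H)\, \mathrm{tr}(A_0 A_0^H) \prod_{k=1}^{i-1} \sigma_k^2\, \mathrm{tr}(A_k A_k^H) = \mathrm{tr}(A_0 A_0^H) \prod_{k=1}^{i} \sigma_k^2\, \mathrm{tr}(A_k A_k^H) \end{aligned} \tag{86}$$

which shows that if Lemma 7 holds for $i - 1$, then it holds for $i$.

Therefore Lemma 7 holds for any $i \ge 1$, which concludes the proof. ■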



Appendix II 
Proof of Theorem 1

In this appendix, we first list the main steps of the proof of Theorem 1 and then present the detailed proof of each step. Note that the proof of Theorem 1 uses tools from the free probability theory introduced in Appendix I. The proof of Theorem 1 consists of the following four steps.

1) Obtain $S_{G_N G_N^H}(z)$.

2) Use $S_{G_N G_N^H}(z)$ to find $\Upsilon_{G_N G_N^H}(z)$.

3) Use $\Upsilon_{G_N G_N^H}(z)$ to obtain $dI/d\eta$.

4) Integrate $dI/d\eta$ to obtain $I$ itself.

• First Step: obtain $S_{G_N G_N^H}(z)$

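Theorem 3: As $k_i$, $i = 0, \ldots, N$ go to infinity with the same rate, the S-transform of $G_N G_N^H$ is given by

$$S_{G_N G_N^H}(z) = S_{M_N^H M_N}(z) \prod_{i=1}^{N} \frac{\rho_{i-1}}{a_i}\, \frac{1}{(z + \rho_{i-1})}\, S_{M_{i-1}^H M_{i-1}}\!\left(\frac{z}{\rho_{i-1}}\right). \tag{87}$$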
Proof: The proof is done by induction using Lemmas 1, 3, 2. First, we prove (87) for $N = 1$. Note that

$$G_1 G_1^H = M_1 \Theta_1 M_0 M_0^H \Theta_1^H M_1^H \tag{88}$$

therefore

'S'GiGf(^)= Sq^m„m»&»mpm,{z) , by LemmaU} 

= 'S'siMoMo^ef (^)'S'MfMi(2;) , by Lemma m 

^^•S'MoM^ef ©1 ( -w I 'S'Mf Ml (z) > by Lemma [7] 



^^'S'MoM^f (^-w^ 'S'efei (^-w^ 5'Mf MiI^) > by Lemma^ 



jrk^MoM^ ( ^ ) ^ib. '^MfMi(^) > by Lemma\3\ 

5m^.Mi(^) ii^VfMo {-£) , by Lemma [3 (89) 

Now, we need to prove that if (87) holds for $N = q$, it also holds for $N = q + 1$. Note that

$$G_{q+1} G_{q+1}^H = M_{q+1} \Theta_{q+1} M_q \Theta_q \cdots M_1 \Theta_1 M_0 M_0^H \Theta_1^H M_1^H \cdots \Theta_q^H M_q^H \Theta_{q+1}^H M_{q+1}^H. \tag{90}$$

Therefore,

$$S_{G_{q+1} G_{q+1}^H}(z) = S_{\Theta_{q+1} M_q \cdots M_q^H \Theta_{q+1}^H M_{q+1}^H M_{q+1}}(z), \quad \text{by Lemma 1.} \tag{91}$$

The empirical eigenvalue distribution of Wishart matrices $\Theta_i \Theta_i^H$ converges almost surely to the Marčenko-Pastur law whose support is compact. Moreover, by assumption, the empirical eigenvalue distribution of $M_i^H M_i$, $i = 0, \ldots, N + 1$ converges to an asymptotic distribution with a compact support. Thus, by Lemma 5, the asymptotic eigenvalue distribution of $M_q \Theta_q \cdots \Theta_q^H M_q^H$ has a compact support. Therefore Lemma 2 can be applied to (91) to show that



z + 1 



z + 



I-'SM,...Mf0f+,0,+i -T- 5m« ,M,+i(2) , by Lemmam 



z + l 



Z + 



5'Mf+iM,+i(2:) , by Lemma^ 



kq+1 



Z+l 



if . \\\ 



."^^O \ _kq 



n 



1 — + — 



-'S'Mf+iM,+i(^) . by Lemma\3\ 



\ J J 



z -\- 1 „ / \ 

r-'3Mf,,M,+il2;j- 



ttg + l Z+l « M ^ 



2 + 



'S'Mf_^,M,+i(^) n — — 



n 



«=1 fc, 



— rr—'-'Mf ,M,_i -J— 

1=1 feg+i \ A;,+i 



9+1 



-5ivr« 



Pi-l 



The proof is complete. 

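• Second Step: use $S_{G_N G_N^H}(z)$ to find $\Upsilon_{G_N G_N^H}(z)$

Theorem 4: Let us define $a_{N+1} = 1$. We have

$$s\, \Upsilon_{G_N G_N^H}^{N}(s) = \prod_{i=0}^{N} \frac{\rho_i}{a_{i+1}}\, \Upsilon_{M_i^H M_i}^{-1}\!\left(\frac{\Upsilon_{G_N G_N^H}(s)}{\rho_i}\right). \tag{93}$$

Proof: From (87) it follows that (with $\rho_N = 1$)

$$\frac{z}{z+1}\, S_{G_N G_N^H}(z) = \frac{1}{z^N} \prod_{i=0}^{N} \frac{\rho_i}{a_{i+1}}\, \frac{z}{(z + \rho_i)}\, S_{M_i^H M_i}\!\left(\frac{z}{\rho_i}\right). \tag{94}$$

Using (57) in (94), we obtain

$$\Upsilon_{G_N G_N^H}^{-1}(z) = \frac{1}{z^N} \prod_{i=0}^{N} \frac{\rho_i}{a_{i+1}}\, \Upsilon_{M_i^H M_i}^{-1}\!\left(\frac{z}{\rho_i}\right) \tag{95}$$

or, equivalently,

$$z^N\, \Upsilon_{G_N G_N^H}^{-1}(z) = \prod_{i=0}^{N} \frac{\rho_i}{a_{i+1}}\, \Upsilon_{M_i^H M_i}^{-1}\!\left(\frac{z}{\rho_i}\right). \tag{96}$$

Substituting $z = \Upsilon_{G_N G_N^H}(s)$ in (96), Equation (93) follows. This completes the proof. ■

• Third Step: use $\Upsilon_{G_N G_N^H}(z)$ to obtain $dI/d\eta$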

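Theorem 5: In the asymptotic regime, as $k_0, k_1, \ldots, k_N$ go to infinity while $\frac{k_i}{k_N} \to \rho_i$, $i = 0, \ldots, N$, the derivative of the instantaneous mutual information is given by

$$\frac{dI_\infty}{d\eta} = \frac{1}{\rho_0 \ln 2} \prod_{i=0}^{N} h_i \tag{97}$$

where $h_0, h_1, \ldots, h_N$ are the solutions to the following set of $N + 1$ equations

$$\prod_{j=0}^{N} h_j = \rho_i\, E\!\left[ \frac{h_i^N \Lambda_i}{\frac{\rho_i}{a_{i+1}} + \eta\, h_i^N \Lambda_i} \right] \qquad i = 0, \ldots, N. \tag{98}$$

The expectation in (98) is over $\Lambda_i$ whose probability distribution function is given by $F_{M_i^H M_i}(\lambda)$ (convention: $a_{N+1} = 1$).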
Proof:
First, we note that

$$\begin{aligned} I &= \frac{1}{k_0} \log\det\big(I + \eta\, G_N G_N^H\big) \\ &= \frac{1}{k_0} \sum_{i=1}^{k_N} \log\big(1 + \eta\, \lambda_{G_N G_N^H}(i)\big) \\ &= \frac{k_N}{k_0} \int \log(1 + \eta\lambda)\, dF_{G_N G_N^H}^{k_N}(\lambda) \\ &\xrightarrow{\ a.s.\ } \frac{1}{\rho_0} \int \log(1 + \eta\lambda)\, dF_{G_N G_N^H}(\lambda) \end{aligned} \tag{99}$$

where $F_{G_N G_N^H}^{k_N}$ is the (non-asymptotic) empirical eigenvalue distribution of $G_N G_N^H$, that converges almost surely to the asymptotic empirical eigenvalue distribution $F_{G_N G_N^H}$, whose support is compact. Indeed, the empirical eigenvalue distribution of Wishart matrices $\Theta_i \Theta_i^H$ converges almost surely to the Marčenko-Pastur law whose support is compact, and by assumption, for $i \in \{0, \ldots, N + 1\}$ the empirical eigenvalue distribution of $M_i^H M_i$ converges to an asymptotic distribution with a compact support. Therefore, according to Lemma 5, the asymptotic eigenvalue distribution of $G_N G_N^H$ has a compact support. The log function is continuous, thus bounded on the compact support of the asymptotic eigenvalue distribution of $G_N G_N^H$. This enables the application of the bounded convergence theorem to obtain the almost-sure convergence in (99).
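It follows from (99) that

$$\begin{aligned} \frac{dI_\infty}{d\eta} &= \frac{1}{\rho_0 \ln 2} \int \frac{\lambda}{1 + \eta\lambda}\, dF_{G_N G_N^H}(\lambda) \\ &= \frac{1}{-\rho_0\, \eta \ln 2} \int \frac{-\eta\lambda}{1 + \eta\lambda}\, dF_{G_N G_N^H}(\lambda) \\ &= \frac{1}{-\rho_0\, \eta \ln 2}\, \Upsilon_{G_N G_N^H}(-\eta). \end{aligned} \tag{100}$$

Let us denote

$$t = \Upsilon_{G_N G_N^H}(-\eta) \tag{101}$$

$$g_i = \Upsilon_{M_i^H M_i}^{-1}\!\left(\frac{t}{\rho_i}\right) \qquad i = 0, \ldots, N \tag{102}$$

and, for the sake of simplicity, let $a = \rho_0 \ln 2$. From (100), we have

$$t = -\eta\, a\, \frac{dI_\infty}{d\eta}. \tag{103}$$

Substituting $s = -\eta$ in (93) and using (101) and (102), it follows that

$$-\eta\, t^N = \prod_{i=0}^{N} \frac{\rho_i}{a_{i+1}}\, g_i. \tag{104}$$

Finally, from (102) and the very definition of $\Upsilon$ in (56), we obtain

$$t = \rho_i \int \frac{g_i \lambda}{1 - g_i \lambda}\, dF_{M_i^H M_i}(\lambda) \qquad i = 0, \ldots, N. \tag{105}$$

Substituting (103) in (104) and (105) yields

$$(-\eta)^{N+1} \left( a\, \frac{dI_\infty}{d\eta} \right)^{N} = \prod_{i=0}^{N} \frac{\rho_i}{a_{i+1}}\, g_i \tag{106}$$

and

$$-\eta\, a\, \frac{dI_\infty}{d\eta} = \rho_i \int \frac{g_i \lambda}{1 - g_i \lambda}\, dF_{M_i^H M_i}(\lambda) \qquad i = 0, \ldots, N. \tag{107}$$

Letting

$$h_i = \left( -\frac{\rho_i}{a_{i+1}\, \eta}\, g_i \right)^{\frac{1}{N}} \tag{108}$$

it follows from (106) that

$$\frac{dI_\infty}{d\eta} = \frac{1}{a} \prod_{i=0}^{N} h_i. \tag{109}$$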
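Using (108) and (109) in (107), we obtain

$$\eta \prod_{j=0}^{N} h_j = \rho_i \int \frac{\eta\, h_i^N \lambda}{\frac{\rho_i}{a_{i+1}} + \eta\, h_i^N \lambda}\, dF_{M_i^H M_i}(\lambda) \qquad i = 0, \ldots, N \tag{110}$$

or, equivalently,

$$\prod_{j=0}^{N} h_j = \rho_i\, E\!\left[ \frac{h_i^N \Lambda_i}{\frac{\rho_i}{a_{i+1}} + \eta\, h_i^N \Lambda_i} \right] \qquad i = 0, \ldots, N. \tag{111}$$

This, along with equation (109), completes the proof. ■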
• Fourth Step: integrate dl/dr] to obtain I itself 

The last step of the proof of Theorem [7] is accomplished by computing the derivative of Jqo in (flVl) 
with respect to t] and showing that the derivative matches ( |97l ). This shows that ( [TT] ) is one primitive 
function of Since primitive functions of differ by a constant, the constant was chosen such 
that the mutual information (ITtI ) is zero when SNR t] goes to zero: lim^_»o looif]) = 0- 

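We now proceed with computing the derivative of $I_\infty$. If (11) holds, then we have (recall $\alpha = \rho_0\ln 2$)

$$\alpha I_\infty = \sum_{i=0}^{N}\rho_i\,E\!\left[\ln\!\left(1+\eta\,\frac{a_{i+1}}{\rho_i}\,h_i^N\Lambda_i\right)\right] - N\eta\prod_{i=0}^{N}h_i, \tag{112}$$

where $\Lambda_i$ is distributed according to $F_{M_i^H M_i}$. From (112) we have

$$\begin{aligned}
\alpha\,\frac{dI_\infty}{d\eta}
&= \sum_{i=0}^{N}\rho_i\,E\!\left[\frac{\frac{a_{i+1}}{\rho_i}\Lambda_i\left(h_i^N + N\eta\,h_i^{N-1}h_i'\right)}{1+\eta\,\frac{a_{i+1}}{\rho_i}\,h_i^N\Lambda_i}\right] - N\prod_{j=0}^{N}h_j - N\eta\sum_{i=0}^{N}h_i'\prod_{j\neq i}h_j \\
&= \sum_{i=0}^{N}\left(1+N\eta\,\frac{h_i'}{h_i}\right)\rho_i\,E\!\left[\frac{h_i^N\Lambda_i}{\frac{\rho_i}{a_{i+1}}+\eta\,h_i^N\Lambda_i}\right] - N\prod_{j=0}^{N}h_j - N\eta\sum_{i=0}^{N}\frac{h_i'}{h_i}\prod_{j=0}^{N}h_j \\
&= \sum_{i=0}^{N}\left(1+N\eta\,\frac{h_i'}{h_i}\right)\prod_{j=0}^{N}h_j - N\prod_{j=0}^{N}h_j - N\eta\sum_{i=0}^{N}\frac{h_i'}{h_i}\prod_{j=0}^{N}h_j \\
&= (N+1)\prod_{j=0}^{N}h_j - N\prod_{j=0}^{N}h_j \\
&= \prod_{j=0}^{N}h_j
\end{aligned} \tag{113}$$

where $h_i' = dh_i/d\eta$ and the third line is due to the fixed-point equations (111). Equation (97) immediately follows from (113). This
completes the proof.

Appendix III
Proof of Theorem 2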

In this appendix, we provide the proof of Theorem 2. The proof of this theorem is based on [26,
Theorem H.1.h], which is restated as a lemma in this paper. Note that [26, Theorem H.1.h] has been used before
to characterize the source precoder maximizing the average mutual information of single-user [18] and
multi-user [19] single-hop MIMO systems with covariance knowledge at the source, and to obtain the relay
precoder maximizing the instantaneous mutual information of a two-hop MIMO system with full CSI at
the relay [9]. We extend the results of [18], [19], [9] to suit the MIMO multi-hop relaying system of our
concern.

The proof consists of the following three steps.

• Step 1: Use the singular value decomposition (SVD) $U_iD_iV_i^H = \Lambda_{t,i+1}^{1/2}U_{t,i+1}^HP_iU_{r,i}\Lambda_{r,i}^{1/2}$ and
show that unitary matrices $U_i$ and $V_i$ impact the maximization of the average mutual information
through the power constraints only, while diagonal matrices $D_i$ affect both the mutual information
expression and the power constraints.

• Step 2: Represent the power constraint expression as a function of $D_i$, $U_i$, $V_i$ and channel
correlation matrices only.

• Step 3: Show that the directions minimizing the trace in the power constraint are those given in
Theorem 2, regardless of the singular values contained in $D_i$.

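Before detailing each step, we recall that the maximum average mutual information is given by

$$C = \max_{\{P_i\}:\ \mathrm{tr}(E[x_ix_i^H])\le k_i\mathcal{P}_i,\ \forall i\in\{0,\dots,N-1\}} E\left[\log\det\left(I_{k_N}+\eta\,G_NG_N^H\right)\right] \tag{114}$$

and we define the conventions $a_0 = 1$ and $C_{r,0} = I_{k_0}$. Note that the latter implies that $U_{r,0} = I_{k_0}$ and $\Lambda_{r,0} = I_{k_0}$.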

• Step 1: clarify how the average mutual information depends on the transmit directions and
the transmit powers

For $i \in \{1,\dots,N\}$ we define

$$\Theta'_i = U_{r,i}^H\,\Theta_i\,U_{t,i}. \tag{115}$$

Since $\Theta_i$ is zero-mean i.i.d. complex Gaussian, thus bi-unitarily invariant, and $U_{r,i}$ and $U_{t,i}$ are unitary
matrices, $\Theta'_i$ has the same distribution as $\Theta_i$.

For $i \in \{0,\dots,N-1\}$, we consider the following SVD

$$U_iD_iV_i^H = \Lambda_{t,i+1}^{1/2}\,U_{t,i+1}^H\,P_i\,U_{r,i}\,\Lambda_{r,i}^{1/2} \tag{116}$$

where $U_i$, $V_i$ are unitary matrices and $D_i$ is a real diagonal matrix with non-negative diagonal elements in
non-increasing order of amplitude.

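We now rewrite the average mutual information as a function of matrices $U_i$, $V_i$ and $D_i$, in order
to take the maximization in (114) over $U_i$, $V_i$ and $D_i$ instead of $P_i$. Using (115) and (116), the average
mutual information $\mathcal{I}$ can be expressed in terms of matrices $\Theta'_i$, $U_i$, $V_i$ and $D_i$ as

$$\begin{aligned}
\mathcal{I} &\triangleq E\left[\log\det\left(I_{k_N}+\eta\,G_NG_N^H\right)\right] \\
&= E\Big[\log\det\Big(I_{k_N}+\eta\,U_{r,N}\Lambda_{r,N}^{1/2}\Theta'_N\,U_{N-1}D_{N-1}V_{N-1}^H\,\Theta'_{N-1}\cdots U_1D_1V_1^H\,\Theta'_1\,U_0D_0V_0^H \\
&\qquad\qquad \times\, V_0D_0^HU_0^H\,\Theta_1'^{H}\,V_1D_1^HU_1^H\cdots\Theta_{N-1}'^{H}\,V_{N-1}D_{N-1}^HU_{N-1}^H\,\Theta_N'^{H}\Lambda_{r,N}^{1/2}U_{r,N}^H\Big)\Big].
\end{aligned} \tag{117}$$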

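$\Theta'_i$ being zero-mean i.i.d. complex Gaussian, multiplying it by unitary matrices does not change its
distribution. Therefore, $\Theta''_i = V_i^H\Theta'_iU_{i-1}$ has the same distribution as $\Theta'_i$ and the average mutual
information can be rewritten

$$\begin{aligned}
\mathcal{I} &= E\left[\log\det\left(I_{k_N}+\eta\,\Lambda_{r,N}^{1/2}\,\Theta''_ND_{N-1}\Theta''_{N-1}\cdots D_1\Theta''_1D_0\,D_0^H\Theta_1''^{H}D_1^H\cdots\Theta_{N-1}''^{H}D_{N-1}^H\Theta_N''^{H}\,\Lambda_{r,N}^{1/2}\right)\right] \\
&= E\left[\log\det\left(I_{k_N}+\eta\,\Lambda_{r,N}^{1/2}\bigotimes_{i=N}^{1}\{\Theta''_iD_{i-1}\}\bigotimes_{i=1}^{N}\{D_{i-1}^H\Theta_i''^{H}\}\,\Lambda_{r,N}^{1/2}\right)\right].
\end{aligned} \tag{118}$$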



The maximum average mutual information can therefore be represented as

$$C = \max_{\substack{D_i,\,U_i,\,V_i:\ \mathrm{tr}(E[x_ix_i^H])\le k_i\mathcal{P}_i \\ \forall i\in\{0,\dots,N-1\}}} E\left[\log\det\left(I_{k_N}+\eta\,\Lambda_{r,N}^{1/2}\bigotimes_{i=N}^{1}\{\Theta''_iD_{i-1}\}\bigotimes_{i=1}^{N}\{D_{i-1}^H\Theta_i''^{H}\}\,\Lambda_{r,N}^{1/2}\right)\right]. \tag{119}$$

Expression (118) shows that the average mutual information $\mathcal{I}$ does not depend on the matrices $U_i$ and
$V_i$, which determine the transmit directions at source and relays, but only depends on the singular values
contained in matrices $D_i$. Nevertheless, as shown by (119), the maximum average mutual information $C$
depends on the matrices $U_i$, $V_i$ (and thus on the transmit directions) through the power constraints.



• Step 2: give the expression of the power constraints as a function of $D_i$, $U_i$, $V_i$ and channel
correlation matrices

We show hereunder that the average power of transmitted signal $x_i$ at the $i$-th relaying level is given by

$$\mathrm{tr}(E[x_ix_i^H]) = a_i\,\mathrm{tr}(P_iC_{r,i}P_i^H)\prod_{k=0}^{i-1}\frac{a_k}{k_k}\,\mathrm{tr}(C_{t,k+1}P_kC_{r,k}P_k^H). \tag{120}$$

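Proof: The average power of transmitted signal $x_i$ can be written as

$$\mathrm{tr}(E[x_ix_i^H]) = \mathrm{tr}\!\left(E\!\left[\bigotimes_{k=i}^{1}\{A_k\Theta_k\}\,A_0A_0^H\bigotimes_{k=1}^{i}\{\Theta_k^HA_k^H\}\right]\right)$$

with

$$A_i = P_iC_{r,i}^{1/2}, \qquad A_k = M_k = C_{t,k+1}^{1/2}P_kC_{r,k}^{1/2}, \quad \forall k\in\{0,\dots,i-1\}. \tag{121}$$

Applying the lemma on the expectation of such matrix products to $\mathrm{tr}(E[x_ix_i^H])$, with entries of $\Theta_k$ of variance $a_k/k_{k-1}$, yields

$$\begin{aligned}
\mathrm{tr}(E[x_ix_i^H]) &= \mathrm{tr}(C_{t,1}P_0C_{r,0}P_0^H)\prod_{k=1}^{i-1}\frac{a_k}{k_{k-1}}\,\mathrm{tr}(C_{t,k+1}P_kC_{r,k}P_k^H)\;\frac{a_i}{k_{i-1}}\,\mathrm{tr}(P_iC_{r,i}P_i^H) \\
&= a_i\,\mathrm{tr}(P_iC_{r,i}P_i^H)\prod_{k=0}^{i-1}\frac{a_k}{k_k}\,\mathrm{tr}(C_{t,k+1}P_kC_{r,k}P_k^H)
\end{aligned} \tag{122}$$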



which concludes the proof. ■ 
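
The case $i = 1$ of this computation, $\mathrm{tr}(E[A_1\Theta_1A_0A_0^H\Theta_1^HA_1^H]) = \frac{a_1}{k_0}\,\mathrm{tr}(A_0A_0^H)\,\mathrm{tr}(A_1A_1^H)$, can be checked by simulation. The sketch below is a Monte Carlo illustration in pure Python under assumed values: the matrices `A0` and `A1`, the dimensions $k_0 = 2$ and $k_1 = 3$, and $a_1 = 0.8$ are arbitrary choices made here, and the helper names are hypothetical.

```python
import math
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def frobenius_sq(a):
    """tr(A A^H), i.e. the squared Frobenius norm of A."""
    return sum(abs(x) ** 2 for row in a for x in row)


def gaussian_matrix(rows, cols, variance, rng):
    """i.i.d. zero-mean circular complex Gaussian entries with E|.|^2 = variance."""
    s = math.sqrt(variance / 2.0)
    return [[complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) for _ in range(cols)]
            for _ in range(rows)]


def empirical_power(a1, a0, variance, trials=20000, seed=0):
    """Monte Carlo estimate of tr(E[A1 Theta A0 A0^H Theta^H A1^H])."""
    rng = random.Random(seed)
    rows, cols = len(a1[0]), len(a0)
    total = 0.0
    for _ in range(trials):
        theta = gaussian_matrix(rows, cols, variance, rng)
        total += frobenius_sq(matmul(a1, matmul(theta, a0)))
    return total / trials


def predicted_power(a1, a0, variance):
    """Closed form: variance * tr(A0 A0^H) * tr(A1 A1^H)."""
    return variance * frobenius_sq(a0) * frobenius_sq(a1)


# Assumed example: k0 = 2 source antennas, k1 = 3 relay antennas, a1 = 0.8.
A0 = [[1.0, 0.5], [0.0, 2.0]]
A1 = [[1.0, 0.0, 1.0j], [0.0, 2.0, 0.0], [0.5, 0.0, 1.0]]
VARIANCE = 0.8 / 2  # a1 / k0

if __name__ == "__main__":
    print(empirical_power(A1, A0, VARIANCE), predicted_power(A1, A0, VARIANCE))
```

With these assumed values the closed form evaluates to $0.4 \times 5.25 \times 7.25 = 15.225$, and the simulated average should approach it as the number of trials grows.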
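Using (120) in the power constraints, those constraints can be rewritten as a product of trace factors:

$$\begin{aligned}
&\mathrm{tr}(P_0P_0^H) \le k_0\mathcal{P}_0 \\
&a_i\,\mathrm{tr}(P_iC_{r,i}P_i^H)\prod_{k=0}^{i-1}\frac{a_k}{k_k}\,\mathrm{tr}(C_{t,k+1}P_kC_{r,k}P_k^H) \le k_i\mathcal{P}_i, \quad \forall i\in\{1,\dots,N-1\}.
\end{aligned} \tag{123}$$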

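In order to express (123) as a function of matrices $U_i$, $V_i$ and $D_i$, we first rewrite (116) as

$$P_i = U_{t,i+1}\Lambda_{t,i+1}^{-1/2}\,U_iD_iV_i^H\,\Lambda_{r,i}^{-1/2}U_{r,i}^H \tag{124}$$

and use (124) in (123) to obtain

$$\begin{aligned}
\mathrm{tr}(P_iC_{r,i}P_i^H) &= \mathrm{tr}\!\left(U_{t,i+1}\Lambda_{t,i+1}^{-1/2}U_iD_iV_i^H\Lambda_{r,i}^{-1/2}U_{r,i}^H\;U_{r,i}\Lambda_{r,i}U_{r,i}^H\;U_{r,i}\Lambda_{r,i}^{-1/2}V_iD_i^HU_i^H\Lambda_{t,i+1}^{-1/2}U_{t,i+1}^H\right) \\
&= \mathrm{tr}(\Lambda_{t,i+1}^{-1}U_iD_i^2U_i^H) \\
\mathrm{tr}(C_{t,k+1}P_kC_{r,k}P_k^H) &= \mathrm{tr}(D_kD_k^H) \\
&= \mathrm{tr}(D_k^2)
\end{aligned} \tag{125}$$

where $D_i^2 = D_iD_i^H$ is a real diagonal matrix with non-negative diagonal elements in non-increasing
order. This leads to the following expression of the power constraints as a function of $U_i$, $D_i$:

$$\begin{aligned}
&\mathrm{tr}(\Lambda_{t,1}^{-1}U_0D_0^2U_0^H) \le k_0\mathcal{P}_0 \\
&a_i\,\mathrm{tr}(\Lambda_{t,i+1}^{-1}U_iD_i^2U_i^H) \le \frac{k_i\mathcal{P}_i}{\prod_{k=0}^{i-1}\frac{a_k}{k_k}\,\mathrm{tr}(D_k^2)}, \quad \forall i\in\{1,\dots,N-1\}.
\end{aligned} \tag{126}$$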

It was shown in Step 1 that matrices $V_i$ do not have an impact on the expression of the average mutual
information $\mathcal{I}$ in (118), and, surprisingly, (126) now shows that matrices $V_i$ do not have an impact on the
power constraints either. In fact, as can be observed from (126), the power constraints depend only on
matrices $U_i$ and $D_i$. It should also be noticed that matrix $U_i$ has an impact on the power constraint of
the $i$-th relay only.
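
The two trace identities of this step, $\mathrm{tr}(P_iC_{r,i}P_i^H) = \mathrm{tr}(\Lambda_{t,i+1}^{-1}U_iD_i^2U_i^H)$ and $\mathrm{tr}(C_{t,i+1}P_iC_{r,i}P_i^H) = \mathrm{tr}(D_i^2)$, can be verified on a small numerical instance. The following Python sketch uses real 2x2 matrices, with rotation matrices standing in for the unitary factors. All numerical values and helper names are illustrative assumptions, not taken from the paper. It builds $P_i$ from the SVD relation and evaluates both sides of each identity for a given choice of $V_i$.

```python
import math


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def chain(*mats):
    """Left-to-right product of several matrices."""
    out = mats[0]
    for m in mats[1:]:
        out = matmul(out, m)
    return out


def transpose(a):
    return [list(row) for row in zip(*a)]


def rotation(angle):
    """Real 2x2 orthogonal matrix, used here in place of a unitary matrix."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s], [s, c]]


def diag(values, power=1.0):
    n = len(values)
    return [[values[i] ** power if i == j else 0.0 for j in range(n)]
            for i in range(n)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def power_traces(v_angle):
    """Return (tr(P C_r P^T), tr(Lt^-1 U D^2 U^T), tr(C_t P C_r P^T), tr(D^2))."""
    u_t, lam_t = rotation(0.3), [2.0, 0.5]    # assumed transmit correlation
    u_r, lam_r = rotation(-1.1), [1.5, 0.7]   # assumed receive correlation
    u, v, d = rotation(0.8), rotation(v_angle), [1.2, 0.4]
    c_t = chain(u_t, diag(lam_t), transpose(u_t))
    c_r = chain(u_r, diag(lam_r), transpose(u_r))
    # Precoder built from the SVD relation (124).
    p = chain(u_t, diag(lam_t, -0.5), u, diag(d), transpose(v),
              diag(lam_r, -0.5), transpose(u_r))
    lhs_relay = trace(chain(p, c_r, transpose(p)))
    rhs_relay = trace(chain(diag(lam_t, -1.0), u, diag(d, 2.0), transpose(u)))
    lhs_hop = trace(chain(c_t, p, c_r, transpose(p)))
    rhs_hop = sum(x * x for x in d)
    return lhs_relay, rhs_relay, lhs_hop, rhs_hop


if __name__ == "__main__":
    print(power_traces(2.0))
    print(power_traces(-0.4))
```

Calling `power_traces` with two different angles for $V_i$ should return the same four values up to rounding, which illustrates that $V_i$ drops out of the power constraints.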

• Step 3: give the optimal transmit directions 

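To determine the optimal directions of transmission at the source, we apply the lemma restating [26, Theorem H.1.h] to the source power
constraint in (126), $\mathrm{tr}(\Lambda_{t,1}^{-1}U_0D_0^2U_0^H) \le k_0\mathcal{P}_0$, and conclude that for all choices of diagonal elements of $D_0$,
the matrix $U_0$ that minimizes the trace $\mathrm{tr}(\Lambda_{t,1}^{-1}U_0D_0^2U_0^H)$ is $U_0 = I_{k_1}$. Therefore, the source precoder
becomes

$$P_0 = U_{t,1}\Lambda_{t,1}^{-1/2}D_0V_0^H\Lambda_{r,0}^{-1/2}U_{r,0}^H = U_{t,1}\Lambda_{t,1}^{-1/2}D_0V_0^H = U_{t,1}\Lambda_{P_0}V_0^H. \tag{127}$$

Equation (127) recalls the known result in the single-hop MIMO case, where the optimal precoding covariance
matrix at the source was shown [18], [19] to be

$$Q^\star \triangleq E[x_0x_0^H] = P_0P_0^H = U_{t,1}\Lambda_{Q^\star}U_{t,1}^H. \tag{128}$$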

Similarly, to determine the optimal direction of transmission at the $i$-th relaying level, we apply the lemma restating [26, Theorem H.1.h]
to the $i$-th power constraint: for all choices of diagonal elements of $D_i^2$, the matrix $U_i$ that minimizes
the trace $\mathrm{tr}(\Lambda_{t,i+1}^{-1}U_iD_i^2U_i^H)$ is $U_i = I_{k_{i+1}}$. This leads to the precoding matrix at level $i$

$$P_i = U_{t,i+1}\Lambda_{t,i+1}^{-1/2}D_iV_i^H\Lambda_{r,i}^{-1/2}U_{r,i}^H. \tag{129}$$
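
The trace minimization used in this step can be illustrated on a restricted search. The Python sketch below only explores permutation matrices, which form a finite subset of the unitary matrices, so it illustrates the claim rather than proving it. It assumes that the eigenvalues in $\Lambda_{t,i+1}$ are sorted in non-increasing order, like the diagonal of $D_i^2$; this ordering, the numerical values and the function names are assumptions made for the example.

```python
from itertools import permutations


def trace_for_permutation(lam, d_squared, perm):
    """tr(Lambda^{-1} U D^2 U^H) when U is the permutation matrix that
    maps basis vector j to basis vector perm[j]."""
    return sum(d_squared[j] / lam[perm[j]] for j in range(len(lam)))


def best_permutation(lam, d_squared):
    """Exhaustive search over permutation matrices for the smallest trace."""
    return min(permutations(range(len(lam))),
               key=lambda perm: trace_for_permutation(lam, d_squared, perm))


# Assumed example, both lists sorted in non-increasing order.
LAMBDA = [4.0, 2.0, 1.0]
D_SQUARED = [9.0, 4.0, 1.0]

if __name__ == "__main__":
    perm = best_permutation(LAMBDA, D_SQUARED)
    print(perm, trace_for_permutation(LAMBDA, D_SQUARED, perm))
```

For these values the search returns the identity permutation `(0, 1, 2)` with trace $9/4 + 4/2 + 1/1 = 5.25$: the largest singular value is paired with the largest eigenvalue of the transmit correlation matrix, in line with the choice $U_i = I$.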

Now, since matrices $V_i$, $i\in\{0,\dots,N-1\}$, have an impact neither on the expression of the average
mutual information nor on the power constraints, they can be chosen to be equal to identity: $V_i = I$, $i\in\{0,\dots,N-1\}$.
This leads to the (non-unique but simple) optimal precoding matrices

$$P_0 = U_{t,1}\Lambda_{P_0}, \qquad P_i = U_{t,i+1}\Lambda_{P_i}U_{r,i}^H \tag{130}$$

with the diagonal matrices $\Lambda_{P_i} = \Lambda_{t,i+1}^{-1/2}D_i\Lambda_{r,i}^{-1/2}$ containing the singular values of $P_i$.

This completes the proof of Theorem 2. ■

Fig. 2. Uncorrelated case: Asymptotic Mutual Information and Instantaneous Mutual Information versus SNR, with K = antennas, for single-hop MIMO, 2 hops, and 3 hops

Fig. 3. Uncorrelated case: Asymptotic Mutual Information and Instantaneous Mutual Information versus SNR, with K = antennas, for single-hop MIMO, 2 hops, and 3 hops

Fig. 4. Uncorrelated case: Asymptotic Mutual Information and Instantaneous Mutual Information versus $K_N$, at SNR=10 dB, for single-hop MIMO, 2 hops, and 3 hops

Fig. 5. One-sided exponential correlation case: Asymptotic Mutual Information and Instantaneous Mutual Information versus SNR, with K = 10 antennas, r=0.3, for single-hop MIMO, 2 hops, and 3 hops

Fig. 6. One-sided exponential correlation case: Asymptotic Mutual Information and Instantaneous Mutual Information versus SNR, with K = 100 antennas, r=0.3, for single-hop MIMO, 2 hops, and 3 hops

[Plot: x-axis K [antennas]; curves for N = 1, 2 and 3 hops, each shown as Asymptotic and Experimental]

Fig. 7. One-sided exponential correlation case: Asymptotic Mutual Information and Instantaneous Mutual Information versus $K_N$, at SNR=10 dB, r=0.3, for single-hop MIMO, 2 hops, and 3 hops



